Last week, iNaturalist hit an exciting milestone—1,000,000 observations with sound!
While this is a small fraction compared to the number of image-based observations, it’s a significant contribution to global biodiversity monitoring. In fact, iNaturalist is now the second-largest provider of sound recordings to the Global Biodiversity Information Facility (GBIF) over the past decade. While initiatives like Macaulay Library at Cornell Lab of Ornithology and FrogID at the Australian Museum also contribute vast sound-generated point records to GBIF, the datasets on the graph below uniquely share with GBIF the sound recordings themselves.
The Growing Role of Sound on iNaturalist
Sound is becoming an increasingly important tool for biodiversity documentation on iNaturalist. Here's how it's being used and our vision for the future.
Using iNaturalist to Record and Annotate Sounds: Case Study from Panama
To explore how iNaturalist is helping record and annotate sounds, we spoke with Brian Gratwicke (@briangratwicke), a long-time iNaturalist user and amphibian conservation lead at the Smithsonian’s National Zoo and Conservation Biology Institute. In Altos de Campana National Park, Panama, where amphibian populations have been devastated by chytrid fungus, Brian and his colleagues Roberto Ibáñez (@ibanezr) and Jorge Guerrel (@jorge_guerrel) have been using sound recordings to make remarkable discoveries.
Recently, the team rediscovered and recorded calls from the Boquete rocket frog (Silverstoneia nubicola), a species that hadn’t been detected in the park for years. They also recorded the calls of the abundant Rainforest rocket frog (Silverstoneia flotator), which has a very similar call. Roberto Ibáñez, a leading expert on frog calls in Panama who has been studying them since the 1980s, is one of the few who can distinguish these species by sound alone.
So far, around 100 contributors have submitted 261 sound observations of 47 out of 188 frog species from Panama. Our goal is to make iNaturalist an even more valuable tool for collecting sound vouchers and annotations, which we hope will attract more experts like Roberto to share their amphibian call expertise on iNaturalist.
Looking Ahead: Sound and AI on iNaturalist
The future of sound on iNaturalist is bright. Grant Van Horn (@gvanhorn), a longtime collaborator on iNaturalist's computer vision projects and creator of Merlin Sound ID, recently worked with iNaturalist staff member Alex Shepard (@alexshepard) and colleagues from the University of Massachusetts Amherst to publish a paper on the iNaturalist Sounds Dataset. This paper, focused on building sound datasets for advancing AI sound models, was just accepted to NeurIPS 2024, one of the world’s top conferences on machine learning and AI. A preprint on arXiv will be available later this month and we’ll share the link here once it’s live.
Our long-term vision is to elevate sound to the same status as images on iNaturalist. We’re committed to developing tools that will make it easier for the community to record, annotate, and showcase sounds. We aim to leverage these data to power the next generation of AI sound models. These models will not only enhance the iNaturalist platform but also be shared with the broader scientific and conservation community.
By the end of 2024, we project that iNaturalist’s computer vision and geo models will cover 100,000 species. Even building an AI sound model capable of accurately identifying 10% of that—around 10,000 species—could be transformative for bioacoustics research.
Join Us in Shaping the Future of Bioacoustics
Can the iNaturalist community rally to generate the data needed for a 10,000-species sound model? We believe the answer is yes. With the right tools, outreach, and collaboration, we can achieve this ambitious goal together. Let's continue working together to expand the power of sound in conservation and biodiversity research!
Tips for Contributing Sound Observations
Identifying species by sight can be tricky, and sound adds an extra layer of challenge! Follow these simple tips to make identification easier for the iNaturalist community:
- Recording Techniques: Get as close as possible to your subject without disturbing it. Stand still and keep quiet to minimize background noise like footsteps, clothing rustle, or other sounds that could obscure your subject’s sound. Point your microphone toward the sound source, which may mean pointing the bottom of your phone toward your subject. Aim for recordings of at least 10 seconds—or ideally 30 seconds if the subject stays put—as longer samples can help with identification.
- Recording Diversity: To help us build a complete picture of each species’ sounds, record different individuals across various locations and times of year. Many shorter recordings from diverse settings are far more useful than a few lengthy ones from the same spot.
- Background Species: While it’s not required, going above and beyond by adding notes about any background species you hear can be incredibly valuable. Even when these sounds overlap with your target subject, they provide important context about the environment and help future listeners better interpret your recording. This extra detail also contributes to the development of machine learning models that recognize all species vocalizing, not just the target species.
- File Format: If you’re uploading sounds recorded outside the iNaturalist app, please use WAV files with a minimum sample rate of 44.1 kHz.
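If you want to check a recording against these tips before uploading, the following is a minimal sketch using Python's standard `wave` module. The function name `check_wav` and the two threshold constants are hypothetical names chosen for illustration; the thresholds simply restate the 44.1 kHz and 10-second figures from the tips above. This is not an iNaturalist tool, and the `wave` module reads only uncompressed PCM WAV files.

```python
import wave

MIN_SAMPLE_RATE_HZ = 44_100  # sample-rate figure taken from the File Format tip
MIN_DURATION_S = 10.0        # shortest recording length suggested in the tips


def check_wav(path):
    """Return (sample_rate_hz, duration_s, problems) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        # Duration is the number of audio frames divided by frames per second.
        duration = wav.getnframes() / sample_rate

    problems = []
    if sample_rate < MIN_SAMPLE_RATE_HZ:
        problems.append(
            f"sample rate {sample_rate} Hz is below {MIN_SAMPLE_RATE_HZ} Hz"
        )
    if duration < MIN_DURATION_S:
        problems.append(
            f"duration {duration:.1f} s is shorter than {MIN_DURATION_S:.0f} s"
        )
    return sample_rate, duration, problems
```

Calling `check_wav` on a file path returns the measured sample rate and duration along with a list of human-readable problems; an empty list means the file meets both figures. Recordings in other formats would need to be converted to WAV first or checked with a different library.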
Posted on October 4, 2024 11:32 PM by loarie
Comments
Reason why I don't upload audio: I have to extract it from video, which is very time-consuming... Observations with sound are definitely very important and awesome, but someday... video support with limited length? 🥺 (Yeah... even compressed, video would take a lot of storage.)
Posted by miyrumiyru about 17 hours ago
I do a lot of sound recording because I'm a Merlin user. But nobody IDs sound on iNaturalist, so I didn't really think it was being used. I have a huge backlog I could upload. Are there any taxa you'd especially like to see documented? I'll try to get a start this weekend.
Posted by nancylightfoot about 17 hours ago
We have good bird audio identifiers in my state of Minnesota. So I'm lucky there.
Way less lucky with mammal sounds (mostly rodents - chipmunks, etc.) and insects (grasshoppers, cicadas, etc.).
I also hold off on audio unless I'm personally curious but I'm willing to collect more if we can get more identifiers.
Posted by mmmiller about 16 hours ago
Why not mp3 files? I use a small portable Olympus WS853 recorder (when uploading files outside of my iNat app). I know the mp3s are compressed files but they serve the purpose of identifications most of the time. Why let perfection be the enemy of the good? :)
Posted by ragkannan about 16 hours ago
Yay!! Congrats!!
Posted by texas_nature_family about 15 hours ago
@ragkannan I think WAV is just a preference, not a requirement.
Anyway, the future plans with sound are so awesome -- I can't wait for the day when there are enough sound observations for an identification model to be functional! Especially hoping for spectrogram display to be a thing someday, like Macaulay Library -- it's super super helpful when doing identifications by sound.
Posted by cigazze about 15 hours ago
Agreed @cigazze! Spectrograms are SO helpful!
Posted by texas_nature_family about 15 hours ago
As someone with ~3000 sound observations and ~4000 sound identifications, I have a few suggestions that will help:
1) Fix the poor sound quality problem of the app and/or uploading. Compression artifacts are easily heard and valuable high frequencies are being discarded.
2) Enable a built-in spectrogram analyzer. This will greatly speed identifications.
3) Make it easier to harvest the vast quantity of bycatch species in existing sound recordings. It could be made much easier.
Posted by dan_johnson about 13 hours ago
One tricky group to deal with is bats. Since bat sounds aren't audible to humans, some people just upload spectrograms instead. My preferred solution is to upload them at 1/10th speed. I wonder though if this would potentially cause problems for the AI models or if they would be able to recognize the patterns regardless. Any bat people here? What are your thoughts?
Posted by zygy about 12 hours ago
@zygy for bats in Africa there is https://www.inaturalist.org/people/jakob
Posted by dianastuder about 10 hours ago
I try to upload sound files for bird observations with additional confirmation from Xeno-canto and Merlin/Birdnet whenever possible. Let us try and build a better library of sounds, so that the software can be trained better. Images with sound will definitely help as well. Looking forward to iNaturalist's own sound recognition tool in the near future. Cheers!
Posted by gs5 about 7 hours ago
We also have almost a thousand observations of plants with sound. While there are some legit ones (popping seeds, for example), I clicked on a handful of random ones and it was always some 2-second random noise that someone uploaded by mistake with some photos. Not sure what to do about this, but at this point, there is clearly not a very clean collection of sounds on iNat.
Posted by opisska about 6 hours ago
@opisska Research Grade should deal with that issue. If there are Research Grade sounds for plants with no detectable plant noise, identify the other species that can be heard, or mark No Evidence of Organism. Needs ID sounds are unlikely to be used for training.
Posted by deboas about 5 hours ago
@deboas - the observations have many good pictures and then there is the nonsensical sound recording. So they are fully OK being research grade, but if you are looking for sounds, they are not helpful.
Posted by opisska about 5 hours ago
@opisska Got you! I can see that being a problem. I can see a case for marking "no" for "Evidence related to a single subject" in such situations.
Posted by deboas about 5 hours ago
@deboas - I just don't want to casualise an entire observation for this. But as far as I know, it's not possible to flag just a single item in an observation, right?
Posted by opisska about 3 hours ago
I understand that Merlin uses the spectrogram image for analysis rather than the audio directly. Is the plan to do the same with AI/CV and iNat audio? I know on the forum discussion about incorporating spectrograms there were concerns about bats and other high-pitched species, similar to what @zygy expressed above. But as others have said, this update is exciting!
I've been working on learning and recording singing insects in Ontario, but I find it slower and more tedious than photo ID because of a combination of lack of identifiers and inability to visualize the sounds (I find I learn things a lot easier visually than aurally).
Posted by upupa-epops about 2 hours ago
I'll be very excited to see audio AI come to the platform in one way or another. Merlin is great for birds, but I've often wished I had something similar for frogs, mammals, and insects.
Regarding plant observations with unrelated audio clips, this feels like another case where per-photo annotation (as opposed to per-observation) would be helpful in sorting and filtering data.
Posted by guerrichache about 2 hours ago