Abstract
Automatic recognition of insect sounds could help us understand changing biodiversity trends around the world, but insect sounds are challenging to recognize even for deep learning, owing to their broad frequency ranges and the limited amount of training data available. We present InsectSet459, a new dataset comprising 26,298 audio files (226.6 hours) from 459 species of Orthoptera (310 species) and Cicadidae (149 species). It is the first large-scale dataset of insect sound that is readily usable for developing novel deep-learning methods. Its recordings were made with a variety of audio recorders at varying sample rates, to capture the extremely broad range of frequencies that insects produce. We benchmark two state-of-the-art deep learning classifiers on the dataset, demonstrating good performance but also significant room for improvement in acoustic insect classification. This dataset can serve as a realistic test case for implementing insect monitoring workflows, and as a challenging basis for developing audio representation methods that can handle highly variable frequencies and/or sample rates.
| Original language | English |
|---|---|
| Article number | 499 |
| Number of pages | 9 |
| Journal | Scientific Data |
| Volume | 13 |
| Issue number | 1 |
| Publication status | Published - 27 Mar 2026 |
Projects
- BioacAI: Bioacoustic AI (MSCA Doctoral Network)
  Stowell, D. (Principal Investigator)
  1/09/23 → 1/09/27
  Project: Research project (active)