A dataset of insect sounds from 459 species for bioacoustic machine learning

  • Marius Faiß*
  • Burooj Ghani
  • Dan Stowell

  *Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Automatic recognition of insect sound could help us understand changing biodiversity trends around the world, but insect sounds are challenging to recognize even for deep learning, due to their broad frequency ranges and the limited amount of training data. We present a new dataset comprising 26,298 audio files (226.6 hours) from 459 species of Orthoptera (310 species) and Cicadidae (149 species). InsectSet459 is the first large-scale dataset of insect sound that is easily applicable for developing novel deep-learning methods. Its recordings were made with a variety of audio recorders using varying sample rates to capture the extremely broad range of frequencies that insects produce. We benchmark performance with two state-of-the-art deep learning classifiers, demonstrating good performance but also significant room for improvement in acoustic insect classification. This dataset can serve as a realistic test case for implementing insect monitoring workflows, and as a challenging basis for the development of audio representation methods that can handle highly variable frequencies and/or sample rates.
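
To give a feel for the scale these figures imply, the following is a minimal sketch that checks the species counts add up and derives two averages from the numbers quoted in the abstract. The constant and function names are illustrative assumptions and are not part of the dataset or of any published tooling. The results are plain means: the abstract does not say how recordings are distributed across species or how long individual files are.

```python
# Figures quoted in the abstract (names below are illustrative, not from the dataset).
N_FILES = 26298        # audio files
TOTAL_HOURS = 226.6    # total recorded duration in hours
N_SPECIES = 459        # species covered
N_ORTHOPTERA = 310     # Orthoptera species
N_CICADIDAE = 149      # Cicadidae species


def mean_clip_seconds(total_hours: float, n_files: int) -> float:
    """Average recording length in seconds, assuming the total covers all files."""
    return total_hours * 3600 / n_files


def mean_files_per_species(n_files: int, n_species: int) -> float:
    """Average number of recordings per species (says nothing about class balance)."""
    return n_files / n_species


if __name__ == "__main__":
    # The two taxonomic groups should account for every species in the dataset.
    print("species counts consistent:", N_ORTHOPTERA + N_CICADIDAE == N_SPECIES)
    print(f"mean clip length: {mean_clip_seconds(TOTAL_HOURS, N_FILES):.1f} s")
    print(f"mean files per species: {mean_files_per_species(N_FILES, N_SPECIES):.1f}")
```

Under these assumptions the average recording is roughly 31 seconds long and each species has about 57 recordings on average.
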
Original language: English
Article number: 499
Number of pages: 9
Journal: Scientific Data
Volume: 13
Issue number: 1
Publication status: Published - 27 Mar 2026
