Project Details
Description
PhD project, part of the Bioacoustic AI doctoral Network.
Short summary:
Using deep learning in computational bioacoustics to help find and identify previously unheard species and sounds.
Long summary:
Standard machine-learning classifiers recognise a fixed list of categories, but almost all acoustic monitoring should really be considered an “open-set” problem. New categories arise for many reasons: undiscovered and rare species, previously-unheard vocal behaviours, anomalies, even newly-arising species. This is especially important in tropical and Global South locations but is also true for any acoustic monitoring at large scale. Our partners have large acoustic datasets recorded in forests and protected areas, with expert annotation covering only a small portion of the time or species detail. We will:
(1) Survey and investigate machine learning methods that can coherently accommodate novel/unknown fine-grained categories, including: hierarchical classification plus explicit “unknown” categories; clustering algorithms and their relation to “sonotypes”; anomaly detection algorithms; general-purpose embeddings (Sethi et al. 2020).
(2) Investigate how to reduce biases in sensitivity of classification and novel-class detection, e.g. optimise for equal probability of detecting new birds/insects/anurans.
(3) Investigate the differing issues of desired and undesired novel classes, e.g. new species versus traffic noise.
(4) Analyse large multi-year acoustic datasets recorded by NCACR in Czechia and MNHN in Jura/French Guiana, validating our algorithms and producing data-driven analyses for the partners.
(5) Implement an “open-set” sound identification workflow that is compatible with public machine learning APIs.
(6) Validate the workflow using synthetic scenarios, including both dataset-driven case studies and user-driven studies with novice and expert users.
(7) Implement user-friendly tools to automate and enhance monitoring processes, demonstrating a live example of an AI pipeline that wildlife NGOs & GOs can adopt.
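To make the “open-set” idea concrete, the following is a minimal sketch of one simple approach: classify an embedding by its nearest known-class centroid, and reject it as “unknown” when it lies too far from every centroid. This is an illustration only, not the project's method. The function names, the toy two-dimensional embeddings, the class labels, and the `max_distance` threshold are all assumptions made for the example.

```python
import math
from collections import defaultdict


def fit_centroids(embeddings, labels):
    """Average the embedding vectors of each known class (hypothetical helper)."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict_open_set(vec, centroids, max_distance):
    """Return the nearest known class, or "unknown" if no centroid is close enough."""
    best_label, best_dist = None, math.inf
    for label, centre in centroids.items():
        dist = math.dist(vec, centre)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    # Rejection step: anything beyond the (assumed) threshold is treated as novel.
    return best_label if best_dist <= max_distance else "unknown"


# Toy embeddings standing in for the output of an audio embedding model.
train_vecs = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]]
train_labels = ["bird", "bird", "frog", "frog"]
centroids = fit_centroids(train_vecs, train_labels)

print(predict_open_set([0.05, 0.05], centroids, max_distance=0.5))
print(predict_open_set([5.0, 5.0], centroids, max_distance=0.5))
```

In this sketch a single global threshold decides what counts as novel. A per-class threshold would be one way to approach the sensitivity biases mentioned in aim (2), and the rejected sounds could be passed on to clustering.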
Layman's description
Sound recognition has come a long way and can be used to identify many animal species based on the sounds they produce. However, what about all the animals that haven't yet been recorded? Can machine learning also be used to help address this challenge and enhance our knowledge and understanding of the animals that surround us?
| Status | Active |
|---|---|
| Effective start/end date | 1/04/24 → 31/03/28 |
Keywords
- Bioacoustics
- Machine Learning
- Ecology
- Anomaly Detection