This project takes a first shot at acoustic scene classification based on audio recordings. It is inspired by the paper 'A Simple Fusion of Deep and Shallow Learning for Acoustic Scene Classification' : https://arxiv.org/abs/1806.07506v2 .
For now I'm only looking at the Deep Learning part.
The audio recordings I'm working on (not available yet) capture the underwater environment at different spots around Dakar. I also have segmentation files that indicate when a fish is heard singing or a shrimp is clicking, which serve as labels. The main goal is to retrieve fish and shrimp activity from any audio recording, including future ones, using a deep learning model. The same approach could be adapted to many other similar applications.