FSD50K_baseline

This repository will contain the code for the baseline experiments included in the following paper. If you use this code or part of it, please cite:

Eduardo Fonseca, Xavier Favory, Jordi Pons, Frederic Font, Xavier Serra, "FSD50K: an Open Dataset of Human-Labeled Sound Events", arXiv:2010.00475, 2020.

This repository will contain a framework that comprises all the basic stages of supervised sound event classification: feature extraction, training, inference and evaluation. After loading the FSD50K dataset, log-mel energies are computed and several baselines can be trained and evaluated. Please check our paper for more details. The system is implemented in TensorFlow.
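
Until the code is released, the feature extraction stage can be illustrated with a minimal NumPy sketch of log-mel energy computation. This is not the baseline implementation: the function names (`mel_filterbank`, `log_mel_energies`) and every parameter value (FFT size, hop size, number of mel bands, HTK-style mel scale) are assumptions chosen for illustration. Please refer to the paper for the settings actually used.

```python
import numpy as np


def hz_to_mel(hz):
    # HTK-style mel scale (an assumption; other mel variants exist).
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels, fmin=0.0, fmax=None):
    """Triangular mel filters with shape (n_mels, n_fft // 2 + 1)."""
    fmax = sr / 2.0 if fmax is None else fmax
    # n_mels + 2 band edges, equally spaced on the mel scale.
    edges_hz = mel_to_hz(np.linspace(hz_to_mel(fmin), hz_to_mel(fmax), n_mels + 2))
    fft_hz = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, fft_hz.size))
    for m in range(n_mels):
        lo, mid, hi = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        rising = (fft_hz - lo) / (mid - lo)
        falling = (hi - fft_hz) / (hi - mid)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def log_mel_energies(audio, sr, n_fft=1024, hop=512, n_mels=64, eps=1e-10):
    """Return log-mel energies with shape (n_frames, n_mels).

    All default values are illustrative placeholders, not the paper's settings.
    """
    audio = np.asarray(audio, dtype=float)
    if audio.size < n_fft:
        # Zero-pad very short clips so that at least one frame exists.
        audio = np.pad(audio, (0, n_fft - audio.size))
    n_frames = 1 + (audio.size - n_fft) // hop
    window = np.hanning(n_fft)
    # Index matrix that slices the signal into overlapping frames.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = audio[idx] * window
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    return np.log(mel + eps)


if __name__ == "__main__":
    sr = 22050
    t = np.arange(sr) / sr
    tone = np.sin(2.0 * np.pi * 1000.0 * t)  # one second of a 1 kHz sine
    print(log_mel_energies(tone, sr).shape)
```

The resulting time-frequency matrix is the kind of input representation a classifier would be trained on. In a TensorFlow pipeline the same steps could be expressed with `tf.signal.stft` and `tf.signal.linear_to_mel_weight_matrix`.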

The code will be made available with the final version of the paper, hopefully before the end of 2021. We will announce its availability on Twitter and in the freesound-annotator Google Group.

In the meantime, make sure to take a look at the resources we just released:

The FSD50K dataset, which can be downloaded from Zenodo: http://doi.org/10.5281/zenodo.4060432
Our journal preprint (arXiv:2010.00475), which describes in depth the creation of the dataset, as well as its characterization and baseline experiments
The FSD50K companion site, where you can inspect the audio content of the dataset per category: https://annotator.freesound.org/fsd/release/FSD50K/

Stay tuned!

Reference

@article{fonseca2020fsd50k,
  title={{FSD50K}: an Open Dataset of Human-Labeled Sound Events},
  author={Fonseca, Eduardo and Favory, Xavier and Pons, Jordi and Font, Frederic and Serra, Xavier},
  journal={arXiv preprint arXiv:2010.00475},
  year={2020}
}
