This was an experiment to see how well Mel Frequency Cepstral Coefficients (MFCCs) and chroma analysis extract features from audio signals. The original goal was to detect whether a given song is by Chet Baker or Beyonce, two clearly very different genres. That turned out to be far harder than expected, so I moved on to simpler data and used raw audiobooks to detect whether a voice is male or female. I then transformed the preprocessed audio snippets and fed them into a neural network to classify the voices by pitch.
- see the `soundTransformation` directory for my implementation of the Mel Frequency Cepstral Coefficients (MFCCs); for most of the transformation I used the Python librosa library (a minimal MFCC sketch follows this list)
- see all preprocessing in the `music` directory
- run `chromogram.py` to get the cleaned-up sound input from `sound_input.wav` -> this produces a data array of 1-second CQT-transformed clips (a CQT sketch also follows this list)
- run the network with `test.py`
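My step-by-step MFCC implementation lives in `soundTransformation`; the librosa-based path it mirrors boils down to roughly the following sketch (the choice of `n_mfcc=13` is my assumption here, not necessarily what the repo uses):

```python
import librosa

# load the input file referenced in the run instructions,
# at librosa's default sample rate of 22050 Hz
y, sr = librosa.load("sound_input.wav")

# librosa builds the mel filterbank and applies the DCT internally;
# 13 coefficients per frame is a common (assumed) choice
mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
print(mfccs.shape)  # (13, n_frames)
```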
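The clip array that `chromogram.py` produces can be sketched like this, assuming default librosa CQT parameters (the actual script may differ):

```python
import librosa
import numpy as np

y, sr = librosa.load("sound_input.wav")

# cut the signal into non-overlapping 1-second clips and
# CQT-transform each one, keeping only the magnitudes
clip_len = sr
n_clips = len(y) // clip_len
clips = np.array([
    np.abs(librosa.cqt(y[i * clip_len:(i + 1) * clip_len], sr=sr))
    for i in range(n_clips)
])
print(clips.shape)  # (n_clips, n_bins, frames_per_clip)
```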
- cleaned up and processed two audiobooks with male & female voices (concatenated the two files, trimmed them to equal lengths, removed silence below a 20 dB threshold); see the preprocessing sketch after this list
- implemented the sound transformation with both MFCCs and CQT
- placed 1-second clips of the audio data into a NumPy array
- network all set up -> 99% accuracy when testing on speakers from the training sample (a stand-in network sketch follows this list)
- 84% accuracy when testing on speakers not in the training sample
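A sketch of the audiobook cleanup and clip extraction, under hypothetical file names (the actual files are not named here), using librosa's silence splitting with the 20 dB threshold:

```python
import librosa
import numpy as np

# hypothetical file names for the two audiobooks
male, sr = librosa.load("male_audiobook.wav")
female, _ = librosa.load("female_audiobook.wav", sr=sr)

# trim both recordings to equal lengths so the classes stay balanced
n = min(len(male), len(female))
male, female = male[:n], female[:n]

def remove_silence(y, top_db=20):
    """Keep only intervals louder than top_db below the signal's peak."""
    intervals = librosa.effects.split(y, top_db=top_db)
    return np.concatenate([y[start:end] for start, end in intervals])

male, female = remove_silence(male), remove_silence(female)

def to_clips(y, clip_len):
    """Stack non-overlapping clips of clip_len samples into one NumPy array."""
    n_clips = len(y) // clip_len
    return np.array([y[i * clip_len:(i + 1) * clip_len] for i in range(n_clips)])

# 1-second clips plus binary labels (0 = male, 1 = female)
male_clips, female_clips = to_clips(male, sr), to_clips(female, sr)
X = np.concatenate([male_clips, female_clips])
labels = np.concatenate([np.zeros(len(male_clips)), np.ones(len(female_clips))])
```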
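As a stand-in for the network run by `test.py`, here is a minimal dense classifier over flattened CQT clips in TensorFlow/Keras; the architecture and hyperparameters are assumptions, not the repo's exact setup:

```python
import numpy as np
import tensorflow as tf

def build_model(input_dim):
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu", input_shape=(input_dim,)),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # P(voice is female)
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# placeholder data shaped like flattened 1-second CQT clips
# (84 CQT bins x 44 frames at librosa's default hop length)
X = np.random.rand(200, 84 * 44).astype("float32")
y = np.random.randint(0, 2, size=200)

model = build_model(X.shape[1])
model.fit(X, y, epochs=5, validation_split=0.1)
```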
For robustness, the next step would be to train on a new data array with a bigger variety of female/male voices -> problematic, as there is no such data available. In terms of trying different models: for more complex tasks a recurrent neural network would be more appropriate, so an idea could be to try that on a larger dataset (sketched below) -> again, tough to get a larger sample.
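For reference, the recurrent idea could look like this in Keras (purely a sketch with assumed layer sizes; each clip would need transposing so that time is the first axis):

```python
import tensorflow as tf

# treat each 1-second clip as a sequence of CQT frames (time steps)
# rather than one flat vector; 64 units is an arbitrary assumed size
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, input_shape=(44, 84)),  # (frames_per_clip, n_bins)
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```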