The MiLe End Hums and Whistles Machine Learning Project

This year at Queen Mary University of London we are going to create a new dataset consisting of labelled audio recordings. Each audio recording will consist of a unique interpretation of a small fragment of one of 8 iconic movie songs.

We will consider fragments of approximately 15 seconds in duration from 8 songs. The names of the songs, the labels we will use to identify them (in parentheses, bold font) and a link to an online resource where you can listen to them are listed below:


Data Interpretations

We will record two types of interpretations of the above-mentioned songs:

  • Humming.
  • Whistling.

There is no right or wrong way of humming or whistling a song. When recording ourselves, we just hummed or whistled as we would normally do (da-da-da, la-la-la, hm-hm-hm, ti-ro-ri, pa-rapa…). We did not sing the lyrics.


Jupyter Notebooks

Basic solution:

Using the MLEnd Hums and Whistles dataset, build a machine learning pipeline that takes as an input a Potter or a StarWars audio segment and predicts its song label (either Potter or StarWars).

Outline of steps:

  • Importing the required Python libraries
  • Data cleaning function
  • Reading and processing the Harry Potter and StarWars audio files
  • Merging and creating the final dataframe
  • Feature extraction from the audio: power, pitch mean, pitch standard deviation, voiced fraction, interpretation label, song label (see the sketch after this list)
  • Data exploration, data normalization, data split
  • Dummy check for humming vs. whistling classification
  • Model 1: SVM classifier for classifying Harry Potter vs. StarWars files
  • Analysing the results:
    • Training accuracy: 0.684
    • Validation accuracy: 0.587
    • Testing accuracy: 0.56
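
The following is a minimal sketch of this pipeline using librosa and scikit-learn. The get_features helper and the files/labels variables are illustrative stand-ins for the dataframe built in the earlier steps, not the exact notebook code.

```python
import numpy as np
import librosa
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def get_features(path):
    """Extract power, pitch mean, pitch std and voiced fraction from one audio file."""
    y, sr = librosa.load(path, sr=None)
    power = np.mean(y ** 2)                               # average signal power
    f0, voiced_flag, _ = librosa.pyin(y, fmin=librosa.note_to_hz('C2'),
                                      fmax=librosa.note_to_hz('C7'), sr=sr)
    pitch_mean = np.nanmean(f0)                           # mean pitch over voiced frames
    pitch_std = np.nanstd(f0)                             # pitch variability
    voiced_fraction = np.mean(voiced_flag)                # share of voiced frames
    return [power, pitch_mean, pitch_std, voiced_fraction]

# `files` and `labels` (e.g. 0 = Potter, 1 = StarWars) are assumed to come from
# the merged dataframe built in the earlier steps.
X = np.array([get_features(f) for f in files])
y = np.array(labels)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
scaler = StandardScaler().fit(X_train)                    # normalise the features
clf = SVC(kernel='rbf').fit(scaler.transform(X_train), y_train)
print('Test accuracy:', clf.score(scaler.transform(X_test), y_test))
```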

We can improve the model by including more advanced audio-processing features such as MFCC, chroma and mel-frequency features (these are included in the advanced solution below).

Advanced solution:

An advanced machine learning solution that predicts the song label of an audio segment across the 7 songs.

Outline of steps:

  • Data processing of the 7 songs
  • Feature extraction: previously we used the following features from the audio data: power, pitch mean, pitch standard deviation, voiced fraction, interpretation label, song label
  • The advanced features we have added are: MFCC, chroma, mel spectrogram and spectral contrast (see the sketch after this list)
  • Feature scaling using z-scores
  • Model 1: modified SVM model
    • Training accuracy: 0.525
    • Validation accuracy: 0.381
    • Testing accuracy: 0.391
  • Model 2: CNN
    • Training accuracy: 0.986
    • Validation accuracy: 0.431
    • Testing accuracy: 0.414
  • Unsupervised gender classification using hierarchical clustering based on the MFCC features of the audio files, combined with an SVM model. Referenced paper: Gender Identification using MFCC for Telephone Applications – A Comparative Study
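
As a rough illustration of the advanced feature set (the exact parameters may differ from the notebooks), each recording can be summarised by time-averaged MFCC, chroma, mel-spectrogram and spectral-contrast vectors, then scaled with z-scores before being fed to the SVM or CNN:

```python
import numpy as np
import librosa
from scipy.stats import zscore

def get_advanced_features(path):
    """Return one feature vector: MFCC, chroma, mel spectrogram and spectral
    contrast, each averaged over time."""
    y, sr = librosa.load(path, sr=None)
    mfcc = np.mean(librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40), axis=1)
    chroma = np.mean(librosa.feature.chroma_stft(y=y, sr=sr), axis=1)
    mel = np.mean(librosa.feature.melspectrogram(y=y, sr=sr), axis=1)
    contrast = np.mean(librosa.feature.spectral_contrast(y=y, sr=sr), axis=1)
    return np.concatenate([mfcc, chroma, mel, contrast])

# `files` is assumed to be the list of audio paths from the data-processing step.
X = np.vstack([get_advanced_features(f) for f in files])
X = zscore(X, axis=0)   # z-score scaling of every feature column
```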

Common approaches to gender recognition are based on analysing the pitch of the speech. However, gender recognition using a single feature is not sufficiently accurate for a large variety of speakers. To capture differences in both the time domain and the frequency domain, a set of features known as Mel-frequency cepstral coefficients (MFCC) is used. These are widely used, state-of-the-art features for automatic speech and speaker recognition. MFCC features are extracted from the speech signal over small windows of 20 to 40 milliseconds. They are also known to work efficiently in noisy environments, and this robustness is why they are widely used in speaker recognition tasks.
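
A hedged sketch of how such a clustering step could look with librosa and scikit-learn (the 25 ms window, 10 ms hop and 13 coefficients are assumptions, not the notebooks' exact settings):

```python
import numpy as np
import librosa
from sklearn.cluster import AgglomerativeClustering

def mfcc_vector(path, n_mfcc=13):
    """Time-averaged MFCC vector for one recording."""
    y, sr = librosa.load(path, sr=None)
    # 25 ms windows with 10 ms hops, in line with the 20-40 ms framing described above.
    n_fft = int(0.025 * sr)
    hop = int(0.010 * sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc, n_fft=n_fft, hop_length=hop)
    return mfcc.mean(axis=1)

# `files` is assumed to be the same list of audio paths used in the notebooks.
X = np.vstack([mfcc_vector(f) for f in files])
# Two hierarchical (agglomerative) clusters act as an unsupervised gender proxy.
clusters = AgglomerativeClustering(n_clusters=2, linkage='ward').fit_predict(X)
```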
