Classifying audio files into 10 music genres using traditional ML models & Deep ANN model

arpithaananth/Music-Genre-Classification-From-Audio-Files

Music Genre Classification From Audio Files

Objective:

  • To analyse audio files and extract features from them
  • To implement a deep artificial neural network to classify the genres
  • To implement traditional machine learning algorithms for the same classification task

Data Set

The GTZAN Genre Collection dataset is used to develop the genre classification algorithms. The dataset consists of 1000 audio tracks, each 30 seconds long. It contains 10 genres, each represented by 100 tracks. All tracks are 22,050 Hz mono 16-bit audio files in .wav format. The size of the dataset is 1.2 GB.
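The stated format can be checked with Python's standard `wave` module. The sketch below is illustrative only: the dataset is not bundled here, so it writes a one-second synthetic tone in the same format (the file name `example.wav` and the helper names are hypothetical) and reads the header back. At 22,050 Hz, a 30-second track would hold 30 × 22,050 = 661,500 samples.

```python
import math
import os
import struct
import tempfile
import wave

SAMPLE_RATE = 22050   # Hz, as stated for the dataset
SAMPLE_WIDTH = 2      # bytes per sample (16-bit)
CHANNELS = 1          # mono


def write_tone(path, seconds=1.0, freq_hz=440.0):
    """Write a synthetic sine tone in the stated format (stand-in for a real track)."""
    n_samples = int(SAMPLE_RATE * seconds)
    frames = b"".join(
        struct.pack("<h", int(0.5 * 32767 * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)))
        for n in range(n_samples)
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(frames)


def describe_wav(path):
    """Return (channels, bit depth, sample rate, duration in seconds) from the WAV header."""
    with wave.open(path, "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()
        return wav.getnchannels(), 8 * wav.getsampwidth(), wav.getframerate(), duration


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.wav")  # hypothetical file name
    write_tone(path)
    print(describe_wav(path))
```

Running `describe_wav` on a real GTZAN track should report one channel, 16 bits, 22,050 Hz and a duration of roughly 30 seconds if the file matches the description above.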

Data Set Source

The dataset has been procured from Marsyas (Music Analysis, Retrieval and Synthesis for Audio Signals), an open source software framework for audio processing with specific emphasis on Music Information Retrieval applications.

A Brief Description of the Features in Audio Files:

- Spectrogram:

A visual representation of how the frequency content of a signal changes over time.
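As a rough illustration, a magnitude spectrogram can be computed by slicing the signal into overlapping windowed frames and taking the FFT of each. This is a minimal sketch assuming NumPy is available; the frame size, hop size and the synthetic 1 kHz test tone are illustrative choices, not settings taken from this repository.

```python
import numpy as np


def magnitude_spectrogram(signal, frame_size=1024, hop_size=512):
    """Return STFT magnitudes with shape (n_frames, frame_size // 2 + 1)."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop_size
    frames = np.stack([
        signal[i * hop_size: i * hop_size + frame_size] * window
        for i in range(n_frames)
    ])
    # One row per time frame, one column per frequency bin.
    return np.abs(np.fft.rfft(frames, axis=1))


sr = 22050
t = np.arange(sr) / sr                    # one second of audio
tone = np.sin(2 * np.pi * 1000.0 * t)     # synthetic 1 kHz tone (assumption)
spec = magnitude_spectrogram(tone)
freqs = np.fft.rfftfreq(1024, d=1.0 / sr)
peak_hz = freqs[spec.mean(axis=0).argmax()]
print(spec.shape, peak_hz)
```

Plotting `spec` as an image (time on one axis, frequency on the other) gives the familiar spectrogram picture; for the test tone, the energy concentrates in the bin nearest 1 kHz.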

- Zero Crossing Rate:

The rate at which the signal changes sign (from positive to negative or vice versa).
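The definition above can be sketched in plain Python as the fraction of adjacent sample pairs whose signs differ. The function name and the two synthetic test tones are assumptions made for illustration; audio libraries typically compute this value per frame rather than over a whole clip.

```python
import math


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for prev, curr in zip(samples, samples[1:])
        if (prev >= 0) != (curr >= 0)
    )
    return crossings / (len(samples) - 1)


sr = 22050
# Synthetic tones with a small phase offset so no sample sits exactly on zero.
low = [math.sin(2 * math.pi * 100 * n / sr + 0.1) for n in range(sr)]
high = [math.sin(2 * math.pi * 1000 * n / sr + 0.1) for n in range(sr)]
print(zero_crossing_rate(low), zero_crossing_rate(high))
```

A pure tone of frequency f crosses zero about 2f times per second, so the higher tone yields a rate roughly ten times larger. Noisy or percussive material tends to score higher than smooth, tonal material.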

- Spectral Centroid:

Indicates where the center of mass of the spectrum is located, i.e. the magnitude-weighted mean of the frequencies present in the signal.
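One common way to write this is centroid = Σ f_k·|X_k| / Σ |X_k|, where |X_k| is the magnitude of frequency bin f_k. The sketch below applies that formula to a whole signal, assuming NumPy is available; the function name and the two-tone test signal are illustrative, and feature-extraction libraries usually evaluate the centroid per frame instead.

```python
import numpy as np


def spectral_centroid(signal, sr):
    """Magnitude-weighted mean frequency (Hz) of the signal's spectrum."""
    magnitudes = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    total = magnitudes.sum()
    return float((freqs * magnitudes).sum() / total) if total > 0 else 0.0


sr = 22050
t = np.arange(sr) / sr
# Two equally loud synthetic tones (assumption): the centroid sits midway between them.
two_tones = np.sin(2 * np.pi * 500 * t) + np.sin(2 * np.pi * 1500 * t)
print(spectral_centroid(two_tones, sr))
```

For the two-tone signal the result is close to 1000 Hz. Brighter sounds, with more high-frequency energy, push the centroid upward.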

- Spectral Rolloff:

A measure of the shape of the spectrum: the frequency below which a specified percentage of the total spectral energy lies.
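A minimal sketch of this definition, assuming NumPy is available: accumulate the spectrum from low to high frequency and report the first bin at which the running total reaches the chosen fraction. The 85% default and the use of magnitudes (rather than squared magnitudes) are assumptions, since the percentage used in this project is not stated.

```python
import numpy as np


def spectral_rolloff(signal, sr, roll_percent=0.85):
    """Frequency (Hz) below which `roll_percent` of the spectral magnitude lies."""
    magnitudes = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    cumulative = np.cumsum(magnitudes)
    if cumulative[-1] == 0:
        return 0.0
    threshold = roll_percent * cumulative[-1]
    # First bin whose cumulative magnitude reaches the threshold.
    index = min(int(np.searchsorted(cumulative, threshold)), len(freqs) - 1)
    return float(freqs[index])


sr = 22050
t = np.arange(sr) / sr
# Three equally loud synthetic tones (assumption), each holding a third of the magnitude.
three_tones = sum(np.sin(2 * np.pi * f * t) for f in (500, 1500, 3000))
print(spectral_rolloff(three_tones, sr), spectral_rolloff(three_tones, sr, roll_percent=0.5))
```

With three equal tones, the 85% point is only reached at the highest tone (about 3000 Hz), while the 50% point is reached at the middle one (about 1500 Hz).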

- Mel-Frequency Cepstral Coefficients:

A small set of features that concisely describe the overall shape of the spectral envelope.
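MFCCs are built on the mel scale, which spaces frequency bands the way pitch is perceived rather than linearly in Hz. The sketch below shows only that first ingredient, not a full MFCC pipeline. It assumes the HTK-style conversion formula; libraries may use a different mel variant, and the function names and band count are hypothetical.

```python
import math


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels (HTK-style formula, assumed here)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(n_bands, f_min=0.0, f_max=11025.0):
    """Band edges (Hz) equally spaced on the mel scale between f_min and f_max."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]


edges = mel_band_edges(n_bands=10)
print([round(e) for e in edges])
```

The printed edges are narrow at low frequencies and widen toward the Nyquist frequency (11,025 Hz for 22,050 Hz audio). An MFCC computation then sums spectral energy in such bands, takes the logarithm, and applies a discrete cosine transform.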

- Chroma Frequencies:

Represents the entire spectrum projected onto 12 bins, one for each distinct semitone of the musical octave.
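The folding into 12 semitones can be sketched as follows: map each frequency to its nearest equal-tempered note, discard the octave, and accumulate the magnitudes per pitch class. This is an illustrative sketch only; the A4 = 440 Hz tuning, the unit-sum normalisation and the hand-written spectrum are assumptions, and real chroma features are computed per frame from an actual spectrogram.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(freq_hz, a4_hz=440.0):
    """Index (0 = C ... 11 = B) of the nearest equal-tempered semitone."""
    midi_note = 69 + 12 * math.log2(freq_hz / a4_hz)
    return round(midi_note) % 12


def chroma_vector(spectrum):
    """Fold (frequency_hz, magnitude) pairs into 12 pitch-class bins summing to 1."""
    bins = [0.0] * 12
    for freq_hz, magnitude in spectrum:
        if freq_hz > 0:
            bins[pitch_class(freq_hz)] += magnitude
    total = sum(bins)
    return [b / total for b in bins] if total > 0 else bins


# Hypothetical spectrum: A in three octaves plus middle C.
spectrum = [(220.0, 1.0), (440.0, 1.0), (880.0, 1.0), (261.63, 1.0)]
chroma = chroma_vector(spectrum)
print(dict(zip(PITCH_CLASSES, chroma)))
```

Because octaves collapse onto the same bin, the three A notes all land in the "A" entry, which is what makes chroma features useful for describing harmony independently of register.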

Implementation of Artificial Neural Network:

A deep neural network was implemented and achieved a test accuracy of 0.665.

Implementation of Traditional Machine Learning Models for Multi-Class Classification:

ROC-AUC Scores of Classifiers:

  • Logistic Regression: 0.8949
  • KNN Classifier: 0.8301
  • Decision Tree: 0.7060
  • Random Forest: 0.8441
  • AdaBoost Classifier: 0.5515
  • Gradient Boosting Classifier: 0.8406
  • XGBoost Classifier: 0.8406
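For a 10-class problem, a single ROC-AUC figure has to be aggregated across classes. The averaging scheme used for the scores above is not stated, so the sketch below assumes one common choice: compute a one-vs-rest AUC per class and take the unweighted (macro) mean. The function names and the three-class toy data are hypothetical and only illustrate the calculation.

```python
def binary_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def macro_ovr_auc(y_true, y_prob, n_classes):
    """Unweighted mean of the one-vs-rest AUC over all classes."""
    aucs = []
    for c in range(n_classes):
        labels = [1 if y == c else 0 for y in y_true]   # class c versus the rest
        scores = [row[c] for row in y_prob]             # predicted probability of class c
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / n_classes


# Toy data (assumption): true labels and predicted class probabilities.
y_true = [0, 0, 1, 1, 2, 2]
y_prob = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.2, 0.6, 0.2],
    [0.3, 0.3, 0.4],
    [0.1, 0.2, 0.7],
    [0.2, 0.3, 0.5],
]
print(macro_ovr_auc(y_true, y_prob, n_classes=3))
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect separation, which helps put the AdaBoost result (0.5515) in context relative to Logistic Regression (0.8949).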
