singapore-music-classifier

Source code accompanying the paper:

Nahar F., Agres K., BT B, Herremans D. 2020. A dataset and classification model for Malay, Hindi, Tamil and Chinese music. 13th Workshop on Music and Machine Learning (MML) as part of ECML/PKDD. Read paper

In this paper we present a new dataset of musical excerpts from the three main ethnic groups in Singapore: Chinese, Malay and Indian (both Hindi and Tamil). We use this dataset to train classification models that distinguish the origin of the music in terms of these ethnic groups. The models were optimized by exploring different musical features as input. Both high-level (musically meaningful) features and low-level (spectrogram-based) features were extracted from the audio files.

The code for the web visualization tool is on the gh-pages branch.
The code for the CNN model is in cnn.py on the main branch.

Features

Low-level features

Low-level features are extracted using the open-source audio feature extraction tool OpenSMILE.
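As a rough illustration, frame-level descriptors like those in the table below can be obtained through the `opensmile` Python package. This is a sketch under assumptions, not the extraction script used for the paper: the ComParE 2016 feature set is assumed because its low-level descriptor names (`F0final`, `jitterLocal`, `pcm_fftMag_...`) match the table, and the function name `extract_low_level` is hypothetical.

```python
def extract_low_level(wav_path):
    """Return a pandas DataFrame with one row per frame and one column per descriptor.

    Assumption: the ComParE 2016 configuration is used; the README does not
    state which OpenSMILE configuration produced the features.
    """
    # Imported lazily so this module can be loaded where opensmile is not installed.
    import opensmile

    smile = opensmile.Smile(
        feature_set=opensmile.FeatureSet.ComParE_2016,
        feature_level=opensmile.FeatureLevel.LowLevelDescriptors,
    )
    return smile.process_file(wav_path)


# Hypothetical usage (requires `pip install opensmile` and a real audio file):
# frames = extract_low_level("excerpt.wav")
# print(frames.filter(like="pcm_RMSenergy").describe())
```

Column names in the returned frame may carry a smoothing suffix in addition to the names listed in the table.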

Feature Name	Definition
F0final	The smoothed fundamental frequency contour
voicingFinalUnclipped	The voicing probability of the final fundamental frequency candidate. Unclipped means that it was not set to zero when it falls below the voicing threshold
jitterLocal	The local (frame-to-frame) Jitter (pitch period length deviations)
jitterDDP	The differential frame-to-frame Jitter (the ‘Jitter of the Jitter’)
shimmerLocal	The local (frame-to-frame) Shimmer (amplitude deviations between pitch periods)
logHNR	Log of the ratio of the energy of harmonic signal components to the energy of noise-like signal components
audspec_lengthL1norm	Magnitude of the L1 norm of the auditory spectrum
audspecRasta_lengthL1norm	Magnitude of the L1 norm of the auditory spectrum after Relative Spectral Transform (RASTA) filtering
pcm_RMSenergy	Root-mean-square signal frame energy
pcm_zcr	Zero-crossing rate of time signal (frame-based)
audSpec_Rfilt (0 ~ 25)	Auditory spectrum bands 0 to 25 after Relative Spectral Transform (RASTA)-style filtering
pcm_fftMag_fband250-650	FFT magnitude of the frequency band from 250 Hz to 650 Hz
pcm_fftMag_fband1000-4000	FFT magnitude of the frequency band from 1000 Hz to 4000 Hz
pcm_fftMag_psySharpness	Psychoacoustic sharpness
pcm_fftMag_spectralHarmonicity	Spectral Harmonicity
pcm_fftMag_mfcc(1 ~ 14)	Mel-frequency cepstral coefficients 1–14
pcm_fftMag_spectralRollOff25.0	Spectral roll-off point: the frequency below which 25% of the spectral content lies
pcm_fftMag_spectralRollOff50.0	Spectral roll-off point: the frequency below which 50% of the spectral content lies
pcm_fftMag_spectralRollOff75.0	Spectral roll-off point: the frequency below which 75% of the spectral content lies
pcm_fftMag_spectralRollOff90.0	Spectral roll-off point: the frequency below which 90% of the spectral content lies
pcm_fftMag_spectralFlux	Spectral flux: the change in the magnitude spectrum between consecutive frames
pcm_fftMag_spectralCentroid	Spectral centroid: the magnitude-weighted mean frequency of the spectrum
pcm_fftMag_spectralEntropy	Spectral entropy: the entropy of the spectrum treated as a probability distribution
pcm_fftMag_spectralVariance	Spectral variance: the spread of the spectrum around its centroid
pcm_fftMag_spectralSkewness	Spectral skewness: the asymmetry of the spectrum around its centroid
pcm_fftMag_spectralKurtosis	Spectral kurtosis: the peakedness of the spectrum around its centroid
pcm_fftMag_spectralSlope	Spectral slope: the slope of a linear fit to the magnitude spectrum
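To make a few of these descriptors concrete, the sketch below computes simplified versions of `pcm_RMSenergy`, `pcm_zcr` and a spectral roll-off point for a single frame in plain Python. It illustrates the definitions only; OpenSMILE's own windowing, normalisation and smoothing may differ, and the function names are hypothetical. Frames are assumed to contain at least two samples.

```python
import math


def rms_energy(frame):
    """Root-mean-square energy of one frame of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def spectral_rolloff(magnitudes, freqs, fraction):
    """Lowest bin frequency at which the cumulative magnitude reaches
    `fraction` of the total (for example 0.25, 0.5, 0.75 or 0.9)."""
    threshold = fraction * sum(magnitudes)
    running = 0.0
    for freq, mag in zip(freqs, magnitudes):
        running += mag
        if running >= threshold:
            return freq
    return freqs[-1]


# A frame that alternates in sign at every sample.
frame = [1.0, -1.0, 1.0, -1.0]
print(rms_energy(frame), zero_crossing_rate(frame))

# A flat magnitude spectrum over four frequency bins (in Hz).
print(spectral_rolloff([1.0, 1.0, 1.0, 1.0], [0, 100, 200, 300], 0.5))
```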

High-level features

High-level features are extracted using the Essentia Python interface.
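A minimal sketch of how a handful of the descriptors in the table below might be computed with Essentia's standard-mode algorithms follows. It is not the script used for the paper: the function name `extract_high_level`, the choice of descriptors and the use of default algorithm parameters are assumptions made for illustration.

```python
def extract_high_level(audio_path):
    """Return a dict with a few rhythm, tonal and loudness descriptors."""
    # Imported lazily so this module can be loaded where Essentia is not installed.
    import essentia.standard as es

    # Load the file as a mono signal (Essentia resamples to 44.1 kHz by default).
    audio = es.MonoLoader(filename=audio_path)()

    # Rhythm descriptors: tempo, beat positions and beat intervals.
    bpm, ticks, _confidence, _estimates, bpm_intervals = es.RhythmExtractor2013()(audio)
    danceability, dfa = es.Danceability()(audio)

    # Tonal descriptors: key, scale and key strength.
    key, scale, strength = es.KeyExtractor()(audio)

    # Loudness descriptors.
    dynamic_complexity, loudness_estimate = es.DynamicComplexity()(audio)

    return {
        "bpm": bpm,
        "ticks": ticks,
        "bpmintervals": bpm_intervals,
        "danceability": danceability,
        "dfa": dfa,
        "key": key,
        "scale": scale,
        "strength": strength,
        "dynamicComplexity": dynamic_complexity,
        "estimate-loudness": loudness_estimate,
    }


# Hypothetical usage (requires Essentia and a real audio file):
# features = extract_high_level("excerpt.wav")
# print(features["bpm"], features["key"], features["scale"])
```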

Descriptor type	Feature name	Property	Algorithm	Definition
Rhythm	beats per minute (bpm)		RhythmExtractor2013	the estimated tempo in beats per minute
Rhythm	ticks	Temporal	RhythmExtractor2013	the estimated tick locations
Rhythm	bpmintervals (beats per minute intervals)	Temporal	RhythmExtractor2013	list of intervals between consecutive beats
Rhythm	loudness	Temporal	BeatsLoudness	the beat's energy in the whole spectrum
Rhythm	danceability		Danceability	the danceability value. Normal values range from 0 to 3. The higher, the more danceable
Rhythm	Detrended Fluctuation Analysis (DFA)		Danceability	the Detrended Fluctuation Analysis (DFA) exponent vector for considered segment length (tau) values
Tonal	key		KeyExtractor	the estimated key, from A to G
Tonal	scale		KeyExtractor	the scale of the key (major or minor)
Tonal	strength		KeyExtractor	the strength of the estimated key
Tonal	chords_changes_rate		TonalExtractor	the rate at which chords change in the progression
Tonal	chords_histogram	Temporal	TonalExtractor	the normalized histogram of chords
Tonal	chords_key		TonalExtractor	the most frequent chord of the progression
Tonal	chords_number_rate		TonalExtractor	the ratio of different chords from the total number of chords in the progression
Tonal	chords_scale		TonalExtractor	the scale of the most frequent chord of the progression (either ’major’ or ’minor’)
Tonal	chords_strength	Temporal	TonalExtractor	the strength of the chord
Tonal	key_key		TonalExtractor	the estimated key, from A to G
Tonal	key_scale		TonalExtractor	the scale of the key (major or minor)
Tonal	key_strength		TonalExtractor	the strength of the estimated key
Tonal	tuningFrequency	Temporal	TuningFrequencyExtractor	the computed tuning frequency
Spectral	frequencies	Temporal	SpectralPeaks	the frequencies of the spectral peaks [Hz]
Spectral	magnitude	Temporal	SpectralPeaks	the magnitudes of the spectral peaks
Pitch	salienceFunction	Temporal	PitchSalienceFunction	array of the quantized pitch salience values
Pitch	salienceBins	Temporal	PitchSalienceFunctionPeaks	the bins of the peaks of the pitch salience function
Pitch	salienceValues	Temporal	PitchSalienceFunctionPeaks	salience values corresponding to the peaks
Loudness	dynamicComplexity		DynamicComplexity	the dynamic complexity coefficient
Loudness	estimate-loudness		DynamicComplexity	an estimate of the loudness [dB]
Loudness	intensity		Intensity	the intensity value
Loudness	larm		Larm	an estimate of the long-term loudness of the audio signal
Loudness	leq		Leq	the equivalent sound level (Leq) of the audio signal
Loudness	loudness (Loudness algorithm)	Temporal	LevelExtractor	the loudness of the audio signal, extracted frame by frame using the Loudness algorithm
Loudness	loudness (Stevens' power law)		Loudness	the loudness of the audio signal as defined by Stevens' power law
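The last row can be illustrated with a short worked example. Essentia documents its Loudness algorithm as the signal energy raised to the power 0.67, following Stevens' power law; the sketch below reproduces that formula in plain Python for a single frame. The function name `stevens_loudness` is hypothetical, and the frame values are made up for illustration.

```python
def stevens_loudness(frame, exponent=0.67):
    """Loudness of one frame as its energy (sum of squares) raised to `exponent`."""
    energy = sum(x * x for x in frame)
    return energy ** exponent


quiet = [0.1, -0.1, 0.1, -0.1]
loud = [0.2, -0.2, 0.2, -0.2]  # same shape, twice the amplitude

# Doubling the amplitude multiplies the energy by 4, so the loudness
# grows by a factor of 4 ** 0.67 rather than by 2 or by 4.
print(stevens_loudness(loud) / stevens_loudness(quiet))
```

Because the exponent is below 1, the measure compresses large energy differences, which is the perceptual effect the power law is meant to capture.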
