Speech Accent Detector

Motivation

We wanted to build a project that predicts the country a person was raised in from their accent. The idea is that most people develop distinctive linguistic patterns from the country where they grew up speaking. Specifically, we want to identify noticeable accent patterns in English speech. In the long term, we hope the project can be used to improve communication between people across cultures and countries.

Data and Preprocessing

Dataset

Official Kaggle Speech Accent Archive Categories:

Arabic
English
French
Spanish
Russian

Mel Frequency Cepstral Coefficient (MFCC)

The primary preprocessing step converts each WAV file to MFCC features. Each WAV file is sampled at a rate of 8000 Hz, producing an array of 12 MFCC values.
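To illustrate what this step computes, here is a minimal NumPy-only sketch of MFCC extraction. It is not the code used in this repository. The 8000 sampling rate and the 12 coefficients come from the description above; the frame length (`N_FFT`), hop size, number of mel filters (`N_MELS`), and the choice to keep coefficients 0 through 11 are assumptions made for the example, and all function names are hypothetical.

```python
import numpy as np

SAMPLE_RATE = 8000  # sampling rate stated above
N_MFCC = 12         # number of MFCC values stated above
N_FFT = 256         # assumption: 32 ms analysis window at 8000 Hz
N_MELS = 26         # assumption: number of triangular mel filters


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(sample_rate=SAMPLE_RATE, n_fft=N_FFT, n_mels=N_MELS):
    """Triangular filters evenly spaced on the mel scale, shape (n_mels, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge of the triangle
            bank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            bank[m - 1, k] = (right - k) / (right - center)
    return bank


def dct_matrix(n_out, n_in):
    """First n_out rows of the orthonormal DCT-II basis, shape (n_out, n_in)."""
    n = np.arange(n_in)
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_in)) * np.sqrt(2.0 / n_in)
    basis[0] *= np.sqrt(0.5)
    return basis


def mfcc(signal, hop=N_FFT // 2):
    """Return an (n_frames, N_MFCC) array for a mono signal of at least N_FFT samples."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - N_FFT) // hop
    # Slice the signal into overlapping frames and apply a Hamming window.
    idx = np.arange(N_FFT)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(N_FFT)
    # Power spectrum -> mel filterbank energies -> log -> DCT.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2 / N_FFT
    log_mel = np.log(power @ mel_filterbank().T + 1e-10)
    return log_mel @ dct_matrix(N_MFCC, N_MELS).T


# Usage: one second of a synthetic 440 Hz tone stands in for a decoded WAV file.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
tone = np.sin(2 * np.pi * 440.0 * t)
features = mfcc(tone)
print(features.shape)
```

In practice a library routine such as `librosa.feature.mfcc` would likely replace the hand-written functions; the sketch only spells out the pipeline (framing, power spectrum, mel filterbank, log, DCT) behind the 12 values per frame.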

Model Architectures

Random Forest
1D Conv Net
