ehughson/voice_toolbox

Voice Toolbox

The place to solve all your audio signal processing needs.

This repo is under construction. The goal is to create a repository that collects voice signal processing functions from different open source projects and libraries, such as Parselmouth and librosa.

Files

To start: set up a conda environment and run 'pip3 install -r requirements.txt' before running the available scripts.

Important: if you get an error with parselmouth, make sure it was installed with 'pip3 install praat-parselmouth'. The PyPI package named 'parselmouth' is a different project.


The script for extracting features is parsel_process.py.

  • To run: "python3 feature_extraction.py [sampling rate] [filepath] [output filepath] --[feature flag]"

Feature flags: formants, ZCR, harmonics, rate_of_speech, loudness, pitch_features, spectral_features, energy
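
As a rough illustration of the command shape above, here is a minimal sketch of how three positional arguments plus one boolean switch per feature flag could be parsed with argparse. The flag names are taken from the list above; the function names, argument names, and help strings are assumptions made for this example and may not match what the script actually does.

```python
import argparse

# Flag names copied from the README list; everything else is hypothetical.
FEATURE_FLAGS = [
    "formants", "ZCR", "harmonics", "rate_of_speech",
    "loudness", "pitch_features", "spectral_features", "energy",
]


def build_parser():
    """Build a parser for: [sampling rate] [filepath] [output filepath] --[feature flag]."""
    parser = argparse.ArgumentParser(prog="feature_extraction.py")
    parser.add_argument("sampling_rate", type=int, help="sampling rate in Hz")
    parser.add_argument("filepath", help="path to the input audio")
    parser.add_argument("output_filepath", help="path for the extracted features")
    for flag in FEATURE_FLAGS:
        # Each feature is an optional on/off switch, e.g. --ZCR or --energy.
        parser.add_argument(f"--{flag}", action="store_true",
                            help=f"extract {flag}")
    return parser


def selected_features(args):
    """Return the requested feature flags, in the order listed in FEATURE_FLAGS."""
    return [flag for flag in FEATURE_FLAGS if getattr(args, flag)]
```

With this sketch, `build_parser().parse_args(["16000", "in.wav", "out.csv", "--ZCR", "--energy"])` yields a namespace for which `selected_features` returns the two requested flags.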

Features currently available:

  1. Spectral Features:
  • pitch
  • pitch range
  • spectral slope
  • mel-frequency cepstral coefficients (MFCC)
  • mean spectral roll-off
  • median F0 (fundamental frequency)
  2. Rate of Speech and Loudness:
  • max intensity
  • mean intensity
  • syllables per second
  • pause rate
  • energy
  3. Harmonics:
  • harmonics-to-noise ratio (HNR)
  • formants: F1, F2, F3, F4
  • number of zero crossings (ZCR)
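
To make two of the simpler features concrete, here is a minimal pure-Python sketch of a zero-crossing count and signal energy computed over a list of samples. It only illustrates the definitions and is not the toolbox's implementation, which may rely on librosa or Parselmouth. The function names and the synthetic sine helper are assumptions made for this example.

```python
import math


def zero_crossings(samples):
    """Count sign changes between consecutive samples (zero is treated as non-negative)."""
    count = 0
    for prev, cur in zip(samples, samples[1:]):
        if (prev >= 0) != (cur >= 0):
            count += 1
    return count


def energy(samples):
    """Return the sum of squared sample values."""
    return sum(s * s for s in samples)


def sine(freq_hz, sampling_rate, duration_s):
    """Generate a unit-amplitude sine tone as a list of floats (test signal only)."""
    n = int(sampling_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sampling_rate) for i in range(n)]


tone = sine(100, 8000, 1.0)
print("zero crossings:", zero_crossings(tone))
print("energy:", energy(tone))
```

A 100 Hz tone crosses zero about 200 times per second, and a unit-amplitude sine has a mean squared value of 0.5, so the energy over 8000 samples comes out close to 4000.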

Extra Scripts for processed features

For visualization:

  1. visualize_voice.py for all scatter plots, along with other plotting features from Praat.
  • To run: 'python3 visualize_voice.py'
  2. radar_plot.py for all radar plots.
  • To run: 'python3 radar_plot.py'

For PCA analysis of voice data: voice_pca.py produces the PCA, recursive feature elimination (RFE), and correlation plots.

  • To run: 'python3 voice_pca.py'

About

A toolbox for the extraction of many voice related features!
