Voice Toolbox

The place to solve all your audio signal processing needs.

This repo is under construction. The goal is to create a repository that contains all voice signal processing functions available from different open source projects and libraries, such as Parselmouth and librosa.

Files

To start: set up a conda environment and run 'pip3 install -r requirements.txt' before running the available scripts.

Important: if you get an error with parselmouth, make sure it was installed with 'pip3 install praat-parselmouth' rather than 'pip3 install parselmouth', which installs an unrelated package.


The script for extracting features is parsel_process.py.

  • To run: "python3 feature_extraction.py [sampling rate] [filepath] [output filepath] --[feature flag]"

Feature flags: formants, ZCR, harmonics, rate_of_speech, loudness, pitch_features, spectral_features, energy
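
As an illustration of the command shape above, here is a minimal sketch of how three positional arguments plus on/off feature flags could be parsed with argparse. This is not the repo's actual parser: the function names `build_parser` and `selected_features`, the integer type for the sampling rate, and the file names in the example are all assumptions made for the sketch.

```python
import argparse

# Flag names taken from the README; everything else here is hypothetical.
FEATURE_FLAGS = [
    "formants", "ZCR", "harmonics", "rate_of_speech",
    "loudness", "pitch_features", "spectral_features", "energy",
]


def build_parser():
    """Build a parser for: [sampling rate] [filepath] [output filepath] --[feature flag]."""
    parser = argparse.ArgumentParser(description="Extract voice features (sketch).")
    # Assumption: the sampling rate is an integer number of Hz.
    parser.add_argument("sampling_rate", type=int, help="sampling rate in Hz")
    parser.add_argument("filepath", help="input audio file")
    parser.add_argument("output_filepath", help="where to write the extracted features")
    for flag in FEATURE_FLAGS:
        # Each feature is an optional boolean switch, e.g. --formants
        parser.add_argument(f"--{flag}", action="store_true", help=f"extract {flag}")
    return parser


def selected_features(args):
    """Return the feature flags that were switched on, in FEATURE_FLAGS order."""
    return [flag for flag in FEATURE_FLAGS if getattr(args, flag)]


# Example with hypothetical file names, passing the arguments as a list
# instead of reading them from the command line.
example_args = build_parser().parse_args(
    ["16000", "sample.wav", "features.csv", "--pitch_features", "--ZCR"]
)
print(selected_features(example_args))  # ['ZCR', 'pitch_features']
```

Several flags can be combined in one call under this sketch; whether the real script accepts more than one flag at a time is not stated here.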

Features currently available:

  1. Spectral Features:
  • pitch
  • pitch range
  • spectral slope
  • mel-frequency cepstral coefficients (MFCC)
  • mean spectral roll-off
  • median F0 (fundamental frequency)
  2. Rate of Speech and Loudness:
  • max intensity
  • mean intensity
  • syllables per second
  • pause rate
  • energy
  3. Harmonics:
  • harmonics-to-noise ratio (HNR)
  • formants: F1, F2, F3, F4
  • number of zero crossings (ZCR)
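
To make two of the simpler features in the list concrete, here is a small pure-Python sketch of a zero-crossing count and a signal energy. It assumes common textbook definitions (a sign change between consecutive samples, and the sum of squared samples); the repo's scripts may compute these differently, and the function names are hypothetical.

```python
def count_zero_crossings(samples):
    """Count sign changes between consecutive samples (zero counts as non-negative)."""
    signs = [s >= 0 for s in samples]
    return sum(1 for prev, cur in zip(signs, signs[1:]) if prev != cur)


def signal_energy(samples):
    """Sum of squared sample values (one common definition of energy)."""
    return sum(s * s for s in samples)


# A tiny made-up waveform, for illustration only.
waveform = [0.5, -0.5, 0.25, -0.25, -0.1, 0.3]
print(count_zero_crossings(waveform))  # 4
print(signal_energy(waveform))
```

On real audio these would normally be computed per frame over arrays rather than over a Python list.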

Extra Scripts for processed features

For visualization:

  1. visualize_voice.py for all scatter plots, along with other plotting features from Praat.
  • To run: 'python3 visualize_voice.py'
  2. radar_plot.py for all radar plots.
  • To run: 'python3 radar_plot.py'

For PCA analysis of voice data: voice_pca.py runs PCA and RFE and draws the correlation plot.

    • To run: 'python3 voice_pca.py'