A script extracting features of emotionally charged speech

zoobereq/emotional_speech

Emotional Speech Analyzer

A script extracting audio features from emotionally charged speech.

The following features are extracted:

  • Pitch (f0 min, f0 max, and f0 mean) measured in Hz
  • Intensity (min, max, and mean) measured in dB
  • Jitter
  • Shimmer
  • Speaking rate (number of words / duration)
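
The speaking-rate definition above (number of words divided by duration) can be sketched in plain Python. This is only an illustration: the function name `speaking_rate` is hypothetical, and it assumes that words are separated by whitespace in the transcript and that duration is given in seconds. The script's actual tokenization and units may differ.

```python
def speaking_rate(transcript: str, duration_s: float) -> float:
    """Return words per second: the word count divided by the duration.

    Assumption: words are whitespace-separated tokens of the transcript.
    """
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return len(transcript.split()) / duration_s


# Seven words spoken over two seconds.
rate = speaking_rate("I can not believe you did that", 2.0)
print(rate)  # 3.5
```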

The script uses Parselmouth, a Python wrapper for Praat.

Notes

  • For pitch extraction, the pitch floor is set to 75 Hz and the pitch ceiling to 600 Hz.
  • Only local jitter is extracted. Period floor is set to 0.0001s, period ceiling to 0.02s, and maximum period factor to 1.3.
  • Only local shimmer is extracted. Period floor is set to 0.0001s, period ceiling to 0.02s, maximum period factor to 1.3, and maximum amplitude factor to 1.6.
  • HNR (harmonics-to-noise ratio) is calculated from harmonicity. For harmonicity, the time step is set to 0.01 s, the minimum pitch to 75 Hz, the silence threshold to 0.1, and the number of periods per window to 1.0.
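
The settings listed above map onto Praat commands roughly as in the sketch below, which uses Parselmouth's `parselmouth.praat.call` interface. It is not the repository's code: the function name `extract_voice_features`, the constant names, and the returned keys are hypothetical, and the choice of the cross-correlation commands (`To PointProcess (periodic, cc)` and `To Harmonicity (cc)`) is an assumption, since the notes do not say which Praat variants the script uses.

```python
from typing import Dict

# Settings taken from the notes above.
PITCH_FLOOR_HZ = 75.0
PITCH_CEILING_HZ = 600.0
PERIOD_FLOOR_S = 0.0001
PERIOD_CEILING_S = 0.02
MAX_PERIOD_FACTOR = 1.3
MAX_AMPLITUDE_FACTOR = 1.6
HARMONICITY_TIME_STEP_S = 0.01
SILENCE_THRESHOLD = 0.1
PERIODS_PER_WINDOW = 1.0

# A time range of (0, 0) tells Praat to analyse the whole sound.
JITTER_ARGS = (0, 0, PERIOD_FLOOR_S, PERIOD_CEILING_S, MAX_PERIOD_FACTOR)
SHIMMER_ARGS = JITTER_ARGS + (MAX_AMPLITUDE_FACTOR,)
HARMONICITY_ARGS = (
    HARMONICITY_TIME_STEP_S,
    PITCH_FLOOR_HZ,
    SILENCE_THRESHOLD,
    PERIODS_PER_WINDOW,
)


def extract_voice_features(wav_path: str) -> Dict[str, float]:
    """Hypothetical helper: mean f0, local jitter, local shimmer, mean HNR."""
    # Imported here so the settings above can be inspected without Parselmouth.
    import parselmouth
    from parselmouth.praat import call

    sound = parselmouth.Sound(wav_path)
    pitch = sound.to_pitch(
        pitch_floor=PITCH_FLOOR_HZ, pitch_ceiling=PITCH_CEILING_HZ
    )
    # Assumption: glottal pulses found with the cross-correlation method.
    pulses = call(
        sound, "To PointProcess (periodic, cc)", PITCH_FLOOR_HZ, PITCH_CEILING_HZ
    )
    # Assumption: harmonicity computed with the cross-correlation method.
    harmonicity = call(sound, "To Harmonicity (cc)", *HARMONICITY_ARGS)

    return {
        "f0_mean_hz": call(pitch, "Get mean", 0, 0, "Hertz"),
        "jitter_local": call(pulses, "Get jitter (local)", *JITTER_ARGS),
        "shimmer_local": call(
            [sound, pulses], "Get shimmer (local)", *SHIMMER_ARGS
        ),
        "hnr_db": call(harmonicity, "Get mean", 0, 0),
    }
```

Calling `extract_voice_features("sample.wav")` on a local recording (the file name is a placeholder) would return a dictionary of the four measures. Keeping the argument tuples as module-level constants makes it easy to check that jitter and shimmer share the same period settings.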

Data

A good source of audio data for emotional speech analysis is the MSP-Podcast corpus. A handful of adapted examples can be downloaded with the included Bash script. The examples consist of seven files, each illustrating a different emotional state. Transcripts of the sample files are included.
