ffmpeg_filters_mse

Calculates and visualizes the temporal domain and frequency domain mean squared error of ffmpeg audio filters.

Setup

Install ffmpeg
Install required Python packages (pip3 install -r requirements.txt)
Create original_audio directory
Put audio in original_audio directory

Usage

Run convert.sh if audio files are not in .wav format
Run apply_filters.py to create filtered audio files
Run mse.py to calculate the mean squared error for each filter
Run visualize.py to visualize the results as bar graphs
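
The steps above can also be chained from one driver script. The following is a minimal sketch, assuming it is run from the repository root; pipeline_commands and run_pipeline are hypothetical helper names and are not part of the repository.

```python
import subprocess
import sys


def pipeline_commands(convert_first=True):
    """Return one command per pipeline stage, in the order listed under Usage."""
    commands = []
    if convert_first:
        # Only needed when original_audio holds non-.wav files.
        # Note that convert.sh deletes the original files after converting them.
        commands.append(["./convert.sh"])
    for script in ("apply_filters.py", "mse.py", "visualize.py"):
        commands.append([sys.executable, script])
    return commands


def run_pipeline(convert_first=True):
    """Run each stage in turn; check=True stops at the first failing stage."""
    for command in pipeline_commands(convert_first):
        subprocess.run(command, check=True)
```

Calling run_pipeline(convert_first=False) skips the conversion step when every file in original_audio is already a .wav file.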

Repository Files

convert.sh

Converts every audio file in original_audio to .wav format using ffmpeg's default conversion settings, then deletes the original files.
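
For readers who prefer Python, the same step could be sketched as below. This is an assumption about the script's behavior based on the description above, not a transcription of convert.sh; wav_conversion_command and convert_directory are hypothetical names.

```python
import subprocess
from pathlib import Path


def wav_conversion_command(source):
    """Build an ffmpeg command that converts `source` to .wav with default settings."""
    source = Path(source)
    target = source.with_suffix(".wav")
    return ["ffmpeg", "-i", str(source), str(target)]


def convert_directory(directory="original_audio"):
    """Convert every non-.wav file in `directory` and delete each original afterwards."""
    for source in sorted(Path(directory).iterdir()):
        if not source.is_file() or source.suffix.lower() == ".wav":
            continue
        subprocess.run(wav_conversion_command(source), check=True)
        # Only reached when ffmpeg exited successfully, so a failed
        # conversion never removes the source file.
        source.unlink()
```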

Usage

./convert.sh

filters.json

JSON formatted list of ffmpeg audio filters

config.json

Configuration file for apply_filters.py and mse.py

key	default	description
filters_filename	filters.json	filename of the JSON formatted filters list
segment_len	262144	number of audio samples in each analyzed segment
sample_skips	262144	number of samples skipped between beginnings of analyzed segments
bit_depth	16	bit depth of analyzed audio
original_audio_dir	original_audio	relative path to search for original audio
filtered_audio_dir	filtered_audio	relative path of filtered audio
output_filename	output.json	filename of JSON formatted mean squared error output
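
The interaction between segment_len and sample_skips can be sketched as follows. With the defaults the two values are equal, so segments are back to back and do not overlap. The function name segment_starts is hypothetical, and dropping a trailing partial segment is an assumption rather than documented behavior.

```python
def segment_starts(num_samples, segment_len=262144, sample_skips=262144):
    """Return the start index of every full-length segment in a signal.

    Consecutive starts are `sample_skips` samples apart. A trailing chunk
    shorter than `segment_len` is ignored (an assumption of this sketch).
    """
    if segment_len <= 0 or sample_skips <= 0:
        raise ValueError("segment_len and sample_skips must be positive")
    last_start = num_samples - segment_len
    return list(range(0, last_start + 1, sample_skips))
```

Setting sample_skips below segment_len would make segments overlap, and setting it above would leave unanalyzed gaps between them.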

config.py

Defines CONFIG_FILENAME, Config class, and associated JSON loader function (load_config).
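
A minimal sketch of what such a module might look like is given below. The field names and defaults come from the config.json table above; the value of CONFIG_FILENAME, the use of a dataclass, and the rejection of unknown keys are assumptions made for illustration.

```python
import json
from dataclasses import dataclass, fields

CONFIG_FILENAME = "config.json"  # assumed value; the README only names the constant


@dataclass
class Config:
    # Defaults mirror the config.json table.
    filters_filename: str = "filters.json"
    segment_len: int = 262144
    sample_skips: int = 262144
    bit_depth: int = 16
    original_audio_dir: str = "original_audio"
    filtered_audio_dir: str = "filtered_audio"
    output_filename: str = "output.json"


def load_config(filename=CONFIG_FILENAME):
    """Load a Config from a JSON file, falling back to defaults for missing keys."""
    with open(filename, encoding="utf-8") as config_file:
        raw = json.load(config_file)
    known = {field.name for field in fields(Config)}
    unknown = set(raw) - known
    if unknown:
        # Fail loudly on misspelled keys instead of silently ignoring them.
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    return Config(**raw)
```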

apply_filters.py

Loads configuration from CONFIG_FILENAME, applies each ffmpeg audio filter listed in filters_filename to every .wav file in original_audio_dir, and writes the resulting audio files to filtered_audio_dir.
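
The core of this step is one ffmpeg invocation per file and filter. The sketch below shows one way to build it with ffmpeg's -af (audio filter) option. The output naming scheme and the function names filter_command and apply_filters are hypothetical, and the filters are assumed to be plain filter strings such as "aecho".

```python
import subprocess
from pathlib import Path


def filter_command(source, audio_filter, filtered_audio_dir="filtered_audio"):
    """Build the ffmpeg command that applies one audio filter to one file."""
    # Use the part before "=" so filter options do not end up in the filename.
    filter_name = audio_filter.split("=", 1)[0]
    # Hypothetical naming scheme: <original stem>_<filter name>.wav
    target = Path(filtered_audio_dir) / f"{Path(source).stem}_{filter_name}.wav"
    # -y overwrites existing output; -af sets the audio filter graph.
    return ["ffmpeg", "-y", "-i", str(source), "-af", audio_filter, str(target)]


def apply_filters(filters, original_audio_dir="original_audio",
                  filtered_audio_dir="filtered_audio"):
    """Apply every filter to every .wav file in the original audio directory."""
    Path(filtered_audio_dir).mkdir(exist_ok=True)
    for source in sorted(Path(original_audio_dir).glob("*.wav")):
        for audio_filter in filters:
            command = filter_command(source, audio_filter, filtered_audio_dir)
            subprocess.run(command, check=True)
```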

Usage

python3 apply_filters.py

mse.py

Loads configuration from CONFIG_FILENAME and, for each filter, calculates the average MSE between segments of the original audio and the corresponding segments of the filtered audio. Each segment is segment_len samples long, and the MSE is computed both in the temporal domain and in the frequency domain (DCT-II). The resulting MSEs are dumped to output_filename in JSON format.
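
The calculation described above can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the code in mse.py: the function names are hypothetical, the DCT-II is left unnormalized (the same convention as the default of scipy.fft.dct), and trailing samples that do not fill a whole segment are ignored. The naive DCT here is O(N^2) and only practical for short inputs; for segments of 262144 samples a fast implementation such as scipy.fft.dct would be the realistic choice.

```python
import math


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def dct2(x):
    """Unnormalized DCT-II: y[k] = 2 * sum_n x[n] * cos(pi * k * (2n + 1) / (2N))."""
    size = len(x)
    return [
        2.0 * sum(value * math.cos(math.pi * k * (2 * n + 1) / (2 * size))
                  for n, value in enumerate(x))
        for k in range(size)
    ]


def segment_mse(original, filtered):
    """Return (temporal MSE, frequency MSE) for one pair of segments."""
    return mse(original, filtered), mse(dct2(original), dct2(filtered))


def average_mse(original, filtered, segment_len, sample_skips):
    """Average both MSEs over all full segments shared by the two signals."""
    usable = min(len(original), len(filtered))  # filtered audio may differ in length
    totals = [0.0, 0.0]
    count = 0
    for start in range(0, usable - segment_len + 1, sample_skips):
        stop = start + segment_len
        temporal, frequency = segment_mse(original[start:stop], filtered[start:stop])
        totals[0] += temporal
        totals[1] += frequency
        count += 1
    if count == 0:
        raise ValueError("signal is shorter than one segment")
    return totals[0] / count, totals[1] / count
```

With an orthonormal DCT the two MSEs would be identical by Parseval's theorem, so a difference between the temporal and frequency figures reflects the scaling of the transform as well as the filter.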

Usage

python3 mse.py

visualize.py

Loads configuration from CONFIG_FILENAME, reads the MSE outputs from output_filename, and plots the results as bar graphs.
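
A rough sketch of this step is shown below. The layout of output.json is not documented here, so the shape {filter name: {"temporal": value, "frequency": value}} is purely an assumption, as are the function names. matplotlib is imported lazily so that the data-preparation helpers work without it.

```python
import json


def load_results(filename="output.json"):
    """Read the MSE results written by the previous step."""
    with open(filename, encoding="utf-8") as results_file:
        return json.load(results_file)


def bar_series(results, domain):
    """Split {filter: {domain: mse}} into parallel lists of names and values."""
    names = sorted(results)
    return names, [results[name][domain] for name in names]


def plot_results(results, domain, figure_path):
    """Save a bar graph of one MSE domain, one bar per filter."""
    import matplotlib.pyplot as plt  # lazy import; only needed for plotting

    names, values = bar_series(results, domain)
    fig, ax = plt.subplots()
    ax.bar(names, values)
    ax.set_ylabel(f"MSE ({domain} domain)")
    # A log scale is a design choice of this sketch: MSE values can span
    # several orders of magnitude across filters.
    ax.set_yscale("log")
    fig.savefig(figure_path)
    plt.close(fig)
```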

Example Results

The following results were calculated from 3 hours of audio extracted from a Twitch VOD.

filter	MSE (temporal domain)	MSE (frequency domain)
acompressor	419.5447047722049	769325.3616135248
acrusher	128.31195087665463	788.4744883700115
aecho	1973.808181613829	11476890.952585308
aphaser	2140.157159476164	7514830.79328153
alimiter	1589.4807644937096	33103402.4035865

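Temporal Domain MSE

[Figure: bar graph of temporal domain MSE per filter]

Frequency Domain MSE

[Figure: bar graph of frequency domain MSE per filter]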

TODO

Add more audio filters
Add better documentation for example results

hrichharms/ffmpeg_filters_mse
