Audio Based Machine Learning with Python Assignment I
Ankit Bahuguna edited this page Apr 26, 2015 · 5 revisions
Generating a Convolved Speech File for a Single Input Speech File
beta_script.py generates the convolved speech files in a given output directory, using parameters supplied on the command line.
Usage: beta_script.py <pathtodir> <referenceFile> <noiseFile> <output_root_directory>
Example run (on the test server; a sample is available at /mnt/alderaan/mlteam3/A1): python beta_script.py ../data/sample/ ref_pink.wav super_market_mall2.wav ~/output/
Sample Output (Before dividing by 65536.0):
Current File: SA2_632.wav
6.0
9.9
6.0
SNR_ratio: 8
loudness of speech: -31.262838506
loudness of noise: -25.613592796
gain: 13.64924571
loudness of adjusted speech: -17.6134095363
check SNR: 8.0001832597
Final output file path: /mnt/alderaan/mlteam3/output/sample/convoluted/SA2_632.wav
MFCC Shape:
(262, 13)
[[ 21.36008246 4.68758863 -21.02879502 ..., -26.48962835 6.03351536
-13.4272054 ]
[ 20.31307126 -1.92123373 -4.03121067 ..., -32.55583596 -1.99548691
-12.51953057]
[ 20.33447627 -4.48978457 -1.50634631 ..., -37.70311405 -2.25513418
-6.88467421]
...,
[ 19.8452419 1.90521342 -7.74598993 ..., -25.54148603 10.53801712
-15.36899988]
[ 19.67552457 -2.96859201 -9.66682243 ..., -20.83320987 9.90071342
-10.93574823]
[ 19.18471866 -2.01026595 -9.93172322 ..., -27.09888399 -0.2349532
-10.09651341]]
Operations Finished!
Sample Output (After dividing by 65536.0):
mlteam3@pcschlichter4:~/A1$ python beta_script.py ../data/sample/ ref_pink.wav super_market_mall2.wav ~/output/
Current File: SA2_632.wav
6.0
9.9
6.0
SNR_ratio: 3
loudness of speech: -31.262838506
loudness of noise: -24.2493999642
gain: 10.0134385417
loudness of adjusted speech: -21.2487508768
check SNR: 3.00064908746
Final output file path: /mnt/alderaan/mlteam3/output/sample/convoluted/SA2_632.wav
MFCC Shape:
(262, 13)
[[ 19.59508939 2.70186776 -9.23936962 ..., -34.90854563 -6.3798238
-6.01107859]
[ 19.55317435 1.11851098 -9.86874784 ..., -27.75712167 1.04552466
-1.52660555]
[ 19.57381244 3.3010208 -8.2860773 ..., -25.3561603 8.64244945
0.60929827]
...,
[ 19.22814775 4.87784517 -6.85796621 ..., -27.2970513 -6.59483469
-13.7599967 ]
[ 19.16051337 7.38684259 -7.06352436 ..., -19.57186845 -0.52468905
-9.31084754]
[ 18.98050344 7.67656769 -6.82664631 ..., -22.64922645 -7.03116873
-13.19350393]]
Operations Finished!
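The logged values in both sample outputs are consistent with a simple relation: the gain is whatever lifts the speech loudness to the noise loudness plus the requested SNR. The following is a minimal sketch of that arithmetic, not the script's actual code. The function names are hypothetical, and the conversion to a linear factor assumes the loudness values are decibels on the usual 20*log10 amplitude scale, which the page does not state.

```python
def gain_for_target_snr(speech_db, noise_db, target_snr_db):
    """Gain (in dB) to add to the speech so that speech - noise == target SNR."""
    return noise_db + target_snr_db - speech_db


def db_to_amplitude(gain_db):
    """Convert a dB gain to a linear amplitude factor (assumes 20*log10 scale)."""
    return 10.0 ** (gain_db / 20.0)


# Values copied from the first sample output (SNR_ratio: 8).
speech_db = -31.262838506
noise_db = -25.613592796
gain_db = gain_for_target_snr(speech_db, noise_db, 8)

print("gain: %.6f" % gain_db)                        # logged gain: 13.64924571
print("expected adjusted speech: %.6f" % (speech_db + gain_db))
print("linear factor: %.4f" % db_to_amplitude(gain_db))
```

The same formula applied to the second run (noise at -24.2493999642, SNR_ratio 3) gives a gain of about 10.01344, matching that log as well. The logged "check SNR" values differ slightly from the targets (8.00018 and 3.00065), presumably because the loudness is measured again on the scaled signal.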
gamma_script.py generates the convolved speech files in a given output directory, taking its speech, noise, and impulse response inputs as directories on the command line.
Usage: gamma_script.py <speaker_speech_dir> <reference_file> <noise_file_dir> <ir_noise_file_dir> <output_root_directory>
Command line arguments:
<speaker_speech_dir>: Root directory containing all the VAD speaker speech data
<reference_file>: Reference file used to calculate the replay gain, named "ref_pink.wav"
<noise_file_dir>: Root directory containing all the noise data
<ir_noise_file_dir>: Root directory containing all the impulse response noise samples
<output_root_directory>: Root directory for the output files, which are stored in the same directory hierarchy as the input speech directory
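The way the output hierarchy mirrors the input can be sketched as follows. This is an illustration under assumptions, not the script's actual code: both helper names are hypothetical, and the random choice of a noise or impulse response file is inferred only from the sample runs on this page, where the same command reports different files on each run.

```python
import os
import random


def mirrored_output_path(speech_root, wav_path, output_root):
    """Map a speech file to its place under output_root, keeping the input hierarchy."""
    # Name of the speech root itself, e.g. "sample" for "../data/sample".
    root_name = os.path.basename(os.path.normpath(speech_root))
    # Path of the file relative to the speech root, including any subdirectories.
    relative = os.path.relpath(wav_path, speech_root)
    return os.path.join(output_root, root_name, relative)


def pick_random_wav(directory, rng=random):
    """Return the path of one randomly chosen .wav file in directory (assumed behaviour)."""
    wavs = sorted(f for f in os.listdir(directory) if f.lower().endswith(".wav"))
    if not wavs:
        raise ValueError("no .wav files found in %s" % directory)
    return os.path.join(directory, rng.choice(wavs))


print(mirrored_output_path("../data/sample",
                           "../data/sample/SA2_632.wav",
                           "../newout1"))
```

On a POSIX system this prints ../newout1/sample/SA2_632.wav, the same output path the sample runs report.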
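Sample runs and output (the same command, run three times, reports a different noise file or impulse response file each time):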
mlteam3@pcschlichter4:~/A1$ python gamma_script.py ../data/sample ref_pink.wav ../data/noise_sample/ /mnt/tatooine/data/impulse_responses/16kHz/wavs16b ../newout1
Noise File Path: ../data/noise_sample/super_market_mall2.wav
IR-Noise File Path: /mnt/tatooine/data/impulse_responses/16kHz/wavs16b/s1_desk.wav
Final output file path: ../newout1/sample/SA2_632.wav
Operations Finished!
mlteam3@pcschlichter4:~/A1$ python gamma_script.py ../data/sample ref_pink.wav ../data/noise_sample/ /mnt/tatooine/data/impulse_responses/16kHz/wavs16b ../newout1
Noise File Path: ../data/noise_sample/super_market_mall2.wav
IR-Noise File Path: /mnt/tatooine/data/impulse_responses/16kHz/wavs16b/s3_desk.wav
Final output file path: ../newout1/sample/SA2_632.wav
Operations Finished!
mlteam3@pcschlichter4:~/A1$ python gamma_script.py ../data/sample ref_pink.wav ../data/noise_sample/ /mnt/tatooine/data/impulse_responses/16kHz/wavs16b ../newout1
Noise File Path: ../data/noise_sample/super_market_mall2_copy.wav
IR-Noise File Path: /mnt/tatooine/data/impulse_responses/16kHz/wavs16b/lg_leather_bag.wav
Final output file path: ../newout1/sample/SA2_632.wav
Operations Finished!
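Each run above pairs a speech file with a noise file and an impulse response. A minimal sketch of what combining them could look like is given here, assuming the speech is convolved with the impulse response and the noise is then added sample by sample. The order of operations, the looping of short noise, and both function names are assumptions, not taken from gamma_script.py. The pure-Python loop is for illustration on short lists; for full-length recordings a vectorized routine such as numpy.convolve would be the usual choice.

```python
def convolve(signal, impulse_response):
    """Full linear convolution; output has len(signal) + len(impulse_response) - 1 samples."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            # Each input sample contributes a scaled, shifted copy of the impulse response.
            out[i + j] += s * h
    return out


def mix_with_noise(speech, noise):
    """Add noise to speech sample by sample, looping the noise if it is shorter."""
    if not noise:
        return list(speech)
    return [s + noise[i % len(noise)] for i, s in enumerate(speech)]


# Toy signals standing in for audio samples.
speech = [1.0, 2.0, 3.0]
impulse_response = [1.0, 0.5]
noise = [0.5, -0.5]

reverberant = convolve(speech, impulse_response)
print(reverberant)                         # [1.0, 2.5, 4.0, 1.5]
print(mix_with_noise(reverberant, noise))  # [1.5, 2.0, 4.5, 1.0]
```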