We create a public repository for the Robust Automatic Transcription of Speech (RATS) Channel-A speech data, a paid dataset distributed by the Linguistic Data Consortium (LDC; see its official website). Here we release its log-Mel filterbank (Fbank) features and several raw waveform listening samples.
Google Drive Address: https://drive.google.com/drive/folders/118JBm0B5txiS7q09SX_xlbf2glbad0jA?usp=sharing
The folder contains a README.md with instructions.
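For readers unfamiliar with log-Mel Fbank features, here is a minimal NumPy sketch of how such features are commonly computed. It is not the extraction pipeline used for the released features. The sample rate (16 kHz), window (25 ms), hop (10 ms), FFT size (512), number of Mel bins (80), and the function names are all assumptions made for illustration. Please refer to the README.md in the Drive folder for the actual configuration.

```python
import numpy as np


def hz_to_mel(freq_hz):
    # HTK-style Mel scale conversion.
    return 2595.0 * np.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters evenly spaced on the Mel scale from 0 Hz to Nyquist."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fbank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fbank


def log_mel_fbank(waveform, sample_rate=16000, n_fft=512,
                  win_length=400, hop_length=160, n_mels=80):
    """Return a (num_frames, n_mels) array of log-Mel energies.

    All default parameters are assumed values, not the ones used for
    the released RATS Channel-A features.
    """
    waveform = np.asarray(waveform, dtype=np.float64)
    if waveform.ndim != 1 or len(waveform) < win_length:
        raise ValueError("expected a 1-D waveform with at least win_length samples")
    # Slice the signal into overlapping frames (no padding).
    n_frames = 1 + (len(waveform) - win_length) // hop_length
    idx = np.arange(win_length)[None, :] + hop_length * np.arange(n_frames)[:, None]
    frames = waveform[idx] * np.hanning(win_length)
    # Power spectrum per frame, then project onto the Mel filters.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    mel_energy = power @ mel_filterbank(n_mels, n_fft, sample_rate).T
    # Small floor avoids log(0) on silent frames.
    return np.log(mel_energy + 1e-10)
```

With these assumed defaults, calling log_mel_fbank on one second of 16 kHz audio (16000 samples) returns an array of shape (98, 80), i.e. one 80-dimensional feature vector per 10 ms frame.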
Here we select one raw waveform sample each from the train, valid, and test sets. We also present their parallel clean samples to show how noisy the RATS Channel-A corpus is.
(Note: since a GitHub README.md does not support the .wav format, we converted the raw .wav files to .mp4 for presentation here; the raw files are available in ./samples.)
a. noisy speech:
train_noisy.mp4
b. clean speech:
train_clean.mp4
a. noisy speech:
valid_noisy.mp4
b. clean speech:
valid_clean.mp4
a. noisy speech:
test_noisy.mp4
b. clean speech:
test_clean.mp4
[1] D. Graff, K. Walker, S. M. Strassel, X. Ma, K. Jones, and A. Sawyer, “The RATS collection: Supporting HLT research with degraded audio data,” in LREC, 2014.