Need help in running prepare_text.sh for Wav2Vec2-U #5459

Open
Nemesis-19 opened this issue Mar 15, 2024 · 0 comments

My task: reproducing the Wav2Vec2-U results on the LibriSpeech 960h corpus.

I have created the train/valid/test.tsv files, for example:
/path/to/data/
3764-168670-0031.wav 131840
8455-210777-0060.wav 113200
7902-96592-0047.wav 74160
237-134500-0009.wav 35520
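
For reference, a manifest in this layout could be produced roughly as follows. This is only a sketch, not the fairseq manifest script: it uses the standard-library `wave` module, assumes uncompressed PCM `.wav` files under the root directory, and assumes the two columns are tab-separated. `write_manifest` is a hypothetical helper name.

```python
import wave
from pathlib import Path


def write_manifest(root, out_path):
    """Write the root directory on line 1, then one 'relative path <TAB> sample count' line per file."""
    root = Path(root)
    lines = [str(root)]
    for wav_path in sorted(root.rglob("*.wav")):
        with wave.open(str(wav_path), "rb") as wav:
            n_frames = wav.getnframes()  # number of samples per channel
        lines.append(f"{wav_path.relative_to(root)}\t{n_frames}")
    Path(out_path).write_text("\n".join(lines) + "\n")
    return lines
```

Calling `write_manifest("/path/to/data", "train.tsv")` would then write one line per `.wav` file found under that directory.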

Next, I removed silences using rVADFast and generated the train/valid/test.vads files, for example:
5120:16960 17120:67520 71840:123360
3680:43200 47200:56320 57920:108320
3360:52640 59520:72800
1440:33600
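
To sanity-check these files, each line can be parsed into speech segments and compared with the sample count in the matching .tsv line. The sketch below assumes that every `start:end` pair is a sample offset and that line N of the .vads file corresponds to entry N of the manifest; both function names are hypothetical.

```python
def parse_vad_line(line):
    """Parse 'start:end start:end ...' into a list of (start, end) integer pairs."""
    segments = []
    for token in line.split():
        start, end = token.split(":")
        segments.append((int(start), int(end)))
    return segments


def kept_samples(segments):
    """Total number of samples covered by the speech segments."""
    return sum(end - start for start, end in segments)


segments = parse_vad_line("5120:16960 17120:67520 71840:123360")
print(kept_samples(segments))  # 113760
```

Under those assumptions, the first example line keeps 113760 of the 131840 samples listed for the first file in the manifest above.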

Then I created the .wrd and .ltr files for all three splits, for example:
test.wrd:
WHY I COULD TIE YOU UP IN A KNOT AND HEAVE YOU OFF THE CLIFF ANY DAY WHAT A GAME
test.ltr:
W H Y | I | C O U L D | T I E | Y O U | U P | I N | A | K N O T | A N D | H E A V E | Y O U | O F F | T H E | C L I F F | A N Y | D A Y | W H A T | A | G A M E |
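
The .wrd to .ltr conversion shown above amounts to spacing out the letters of each word and closing every word with a `|` marker. A minimal sketch of that mapping (`words_to_letters` is a hypothetical name, not a fairseq function):

```python
def words_to_letters(transcript):
    """Turn 'WHY I' into 'W H Y | I |': letters space-separated, '|' after each word."""
    words = transcript.split()
    return " ".join(" ".join(word) + " |" for word in words)


print(words_to_letters("WHY I COULD TIE YOU UP"))
# W H Y | I | C O U L D | T I E | Y O U | U P |
```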

The next step is creating the .phn files for all three splits, for which I need to run the prepare_text.sh script.

Can someone please guide me in running this script? I am confused about which parameters to pass.
It takes: lg=$1, text_path=$2, target_dir=$3, min_phones=$4, phonemizer=$5, lid_path=$6, sil_prob=$7

As far as I can tell, lg = en, target_dir = the directory to save results in, phonemizer = espeak, and lid_path = lid.176.bin

I am unsure about the others. Can someone please verify my steps so far and guide me in running prepare_text.sh?
(If everything is correct, what do the remaining parameters mean, and what should I pass for them?)
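
For concreteness, the positional order listed above would map to an invocation like the one sketched here. The helper name `build_prepare_text_cmd` is hypothetical, the paths are placeholders, and the `min_phones` and `sil_prob` values are purely illustrative, not verified settings. Launching the script through `zsh` is also an assumption. Judging only by the names, `min_phones` looks like a minimum phone-count threshold and `sil_prob` like a silence-insertion probability, but that still needs confirmation.

```python
def build_prepare_text_cmd(script, lg, text_path, target_dir, min_phones,
                           phonemizer, lid_path, sil_prob, shell="zsh"):
    """Return the argv list for prepare_text.sh with arguments in $1..$7 order."""
    return [shell, script, lg, text_path, target_dir, str(min_phones),
            phonemizer, lid_path, str(sil_prob)]


cmd = build_prepare_text_cmd(
    script="prepare_text.sh",
    lg="en",
    text_path="/path/to/text/file",   # placeholder path
    target_dir="/path/to/output/dir",  # placeholder path
    min_phones=1000,                   # illustrative value, unverified
    phonemizer="espeak",
    lid_path="lid.176.bin",
    sil_prob=0.25,                     # illustrative value, unverified
)
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```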

Thanks and regards
