MMS alignment README fixes (#5432)
* Mention sox install through apt, on top of the Python wrapper
* Fix argument name in example command
raphaelmerx committed Jan 24, 2024
1 parent fad2c4d commit 3f0f20f
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions examples/mms/data_prep/README.md
@@ -14,8 +14,8 @@ We describe the process of aligning long audio files with their transcripts and

- Step 3: Install a few other dependencies
```
-pip install sox
-pip install dataclasses
+apt install sox
+pip install sox dataclasses
```

- Step 4: Create a text file containing the transcript for a (long) audio file. Each line in the text file will correspond to a separate audio segment that will be generated upon alignment.
@@ -29,7 +29,7 @@ We describe the process of aligning long audio files with their transcripts and

- Step 5: Run forced alignment and segment the audio file into shorter segments.
```
-python align_and_segment.py --audio /path/to/audio.wav --textfile /path/to/textfile --lang <iso> --outdir /path/to/output --uroman /path/to/uroman/bin
+python align_and_segment.py --audio /path/to/audio.wav --text_filepath /path/to/textfile --lang <iso> --outdir /path/to/output --uroman /path/to/uroman/bin
```

The above code will generate the audio segments under the output directory based on the content of each line in the input text file. The `manifest.json` file consists of the segmented audio filepaths and their corresponding transcripts.
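
As a rough illustration of consuming the output described above, the sketch below reads `manifest.json` in Python. It rests on assumptions the README excerpt does not confirm: that the manifest is written as one JSON object per line, and that each entry exposes keys such as `audio_filepath` and `text` (both key names are hypothetical here). The helper names `iter_manifest` and `load_manifest` are illustrative and not part of the repository.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def iter_manifest(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one dict per non-empty line.

    Assumption: the manifest is in JSON Lines form (one JSON object per line).
    """
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


def load_manifest(path: str) -> List[Dict[str, Any]]:
    """Read every manifest entry from the file at `path` into a list."""
    with open(path, encoding="utf-8") as handle:
        return list(iter_manifest(handle))
```

A call such as `load_manifest("/path/to/output/manifest.json")` would then return a list of entries. Reading fields with `entry.get("audio_filepath")` and `entry.get("text")` avoids a `KeyError` if the actual key names differ from the ones assumed here.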