tips / recipe guidelines - FLEURS xtreme_s or others #1194

Open
flutter-painter opened this issue Oct 20, 2023 · 6 comments

Comments

@flutter-painter

Could you please provide a recipe for FLEURS / xtreme_s ?

@pzelasko
Collaborator

Could you take a look at the existing recipes in lhotse/recipes and see if you can use any of them as a basis to write your own? If you'd be willing to make a PR, I'm happy to review it.

@flutter-painter
Author

Thank you for your quick reply.
I looked, but I don't have enough practice yet.
I am gathering more audio first and will dig deeper in a few months.

@flutter-painter changed the title from "tips / recipe for FLEURS xtreme_s ?" to "tips / recipe guidelines - FLEURS xtreme_s or others" on Jan 13, 2024
@flutter-painter
Author

Hi,
I looked at the recipes again but I am still puzzled.
I also edited the issue title, since there is a wider dataset for Fula that goes beyond FLEURS here :

It is structured in three fields: audio, transcription, dialect.
Are there any guidelines as to which recipe could be used for this kind of dataset?
Should I try them at random until I find one that looks similar to the dataset I intend to use?

@desh2608
Collaborator

A "recipe" in Lhotse simply means a script that creates standard manifests (a recording manifest and a supervision manifest) out of any dataset of your choice. You can take a look at lhotse/recipes to see several examples of how this is done, and then write a recipe yourself for the FLEURS dataset.
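
To make the two manifests concrete, here is a minimal standard-library sketch of the kind of information each one holds. It is not Lhotse code: the helper names (`describe_recording`, `describe_supervision`, `write_jsonl`) and the plain-dict layout are assumptions made for illustration. In Lhotse itself the corresponding objects are `Recording` and `SupervisionSegment`, collected into a `RecordingSet` and a `SupervisionSet`.

```python
import json
import wave
from pathlib import Path


def describe_recording(wav_path: Path) -> dict:
    """Collect the audio-level metadata a recording manifest entry would hold.

    Hypothetical helper: it reads a PCM WAV header with the stdlib `wave` module.
    """
    with wave.open(str(wav_path), "rb") as f:
        num_samples = f.getnframes()
        sampling_rate = f.getframerate()
        num_channels = f.getnchannels()
    return {
        "id": wav_path.stem,
        "path": str(wav_path),
        "sampling_rate": sampling_rate,
        "num_samples": num_samples,
        "duration": num_samples / sampling_rate,
        "num_channels": num_channels,
    }


def describe_supervision(recording: dict, text: str, language: str) -> dict:
    """Describe one transcribed segment that spans the whole recording."""
    return {
        "id": f"{recording['id']}-0",
        "recording_id": recording["id"],  # links the transcript to its audio
        "start": 0.0,
        "duration": recording["duration"],
        "channel": 0,
        "text": text,
        "language": language,
    }


def write_jsonl(items, out_path: Path) -> None:
    """Write one JSON object per line."""
    with open(out_path, "w", encoding="utf-8") as f:
        for item in items:
            f.write(json.dumps(item, ensure_ascii=False) + "\n")
```

A recipe would run something like this over every utterance in the corpus and write one file per manifest type and per split. The key point is that each supervision entry refers back to a recording through `recording_id`.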

@flutter-painter
Author

Hi Desh,
Thank you for pointing me again at the same link. Could you please name one recipe that you consider a good starting point ?

@desh2608
Collaborator

You can look at the AISHELL recipe for an example. But keep in mind that each recipe will be different depending on how your data is structured. If you are familiar with Kaldi, think of a recipe as the equivalent of the prepare_data.sh scripts.
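
As an illustration of how the data layout drives the recipe, here is a rough sketch of the parsing step for a dataset with the three fields mentioned earlier in the thread (audio, transcription, dialect). It assumes the metadata is available as a CSV file with exactly those column names, which the thread does not confirm. The function name `iter_utterances` and the generated utterance IDs are hypothetical.

```python
import csv
from typing import Iterator, TextIO


def iter_utterances(metadata: TextIO) -> Iterator[dict]:
    """Yield one entry per transcribed row of the metadata table.

    Assumes a CSV header with the columns: audio, transcription, dialect.
    """
    reader = csv.DictReader(metadata)
    for index, row in enumerate(reader):
        text = row["transcription"].strip()
        if not text:
            continue  # skip rows that have no transcript
        yield {
            "id": f"utt-{index:06d}",  # index of the row in the table
            "audio": row["audio"],  # path to the audio file
            "text": text,
            # Dataset-specific fields such as dialect go into a side dictionary.
            "custom": {"dialect": row["dialect"]},
        }
```

Each yielded entry carries what is needed to build one recording entry (from `audio`) and one supervision entry (from `text` and `custom`). If the dataset ships in another form, only this parsing step should need to change.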


3 participants