Number of MLS model parameters and Polish dev loss curve fluctuations #1022

Open
CriDora opened this issue Apr 1, 2024 · 0 comments
CriDora commented Apr 1, 2024

Question

I have read the paper MLS: A LARGE-SCALE MULTILINGUAL DATASET FOR SPEECH RESEARCH and have some questions. In the paper, a 36-layer transformer is used to train the monolingual models, and I would like to know the model size. A 1 GB acoustic model is provided in the mls folder, but I want to know the number of parameters of the model.

Besides, when reproducing the paper's monolingual results for Polish, the dev loss always fluctuates heavily, which did not happen for Portuguese or Italian. It still fluctuates even after adjusting the learning rate. However, when I pool the train and dev sets, shuffle them, and split them again, the dev loss converges well. How can I track down the problem?
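On the first question, a rough upper bound on the parameter count can be estimated from the size of the released file. This is only a sketch under assumptions that are not confirmed anywhere above: that the weights are stored as 32-bit floats (4 bytes each) and that the file holds little besides the weights. The function name `estimate_param_count` is made up for illustration. If the serialized file also contains optimizer state or other metadata, the true count would be lower.

```python
def estimate_param_count(file_size_bytes: int, bytes_per_param: int = 4) -> int:
    """Upper-bound estimate of the number of parameters in a checkpoint.

    Assumes every byte in the file is a weight stored with
    `bytes_per_param` bytes (4 for float32, 2 for float16).
    Both are assumptions, not facts about the released model.
    """
    if file_size_bytes < 0:
        raise ValueError("file_size_bytes must be non-negative")
    if bytes_per_param <= 0:
        raise ValueError("bytes_per_param must be positive")
    return file_size_bytes // bytes_per_param


# Roughly 1 GB acoustic model, taken here as 1 GiB for the sake of the example.
approx_size = 1024 ** 3
params = estimate_param_count(approx_size)
print(f"at most about {params / 1e6:.0f}M parameters")
```

Under those assumptions a 1 GB file corresponds to at most a few hundred million parameters. An exact number would still need confirmation from the maintainers or from loading the model and summing the sizes of its parameter tensors.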
