[BUG] Loss drops, model still produces gibberish? #23
Comments
@MichelNivard try training it now and see what happens; I've made many optimizations.
Okay, digging into it later today, thanks!
Hi, I trained the model to completion using the train.py script, although I used a larger batch size and fewer epochs because I trained on a different GPU. However, the model produces gibberish.
The validation line was:
Could we add proper checkpointing to the training loop in train.py? I've tried torch.save({}), but the model can't be opened with Netron for validation. I'm obviously missing something.
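A minimal sketch of what that checkpointing could look like, assuming train.py has `model` and `optimizer` objects in scope (the helper names, `it`, and the `ckpt.pt` path are hypothetical):

```python
import torch

def save_checkpoint(model, optimizer, it, loss, path="ckpt.pt"):
    # Bundle everything needed to resume training into one pickled dict.
    torch.save({
        "iteration": it,
        "model_state_dict": model.state_dict(),
        "optimizer_state_dict": optimizer.state_dict(),
        "loss": loss,
    }, path)

def load_checkpoint(model, optimizer, path="ckpt.pt"):
    # Restore weights and optimizer state before resuming or sampling.
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model_state_dict"])
    optimizer.load_state_dict(ckpt["optimizer_state_dict"])
    return ckpt["iteration"], ckpt["loss"]
```

On the Netron point: a torch.save checkpoint is a pickled Python dict, not a computation graph, so Netron has nothing to render; exporting the model to a graph format such as ONNX (via torch.onnx.export) or TorchScript is usually what Netron expects.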
Describe the bug
After 5300 iterations the loss is near 2.7; is it still supposed to spit out near-gibberish?
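As a rough sanity check (assuming the reported loss is mean per-token cross-entropy in nats), a loss of 2.7 corresponds to a perplexity of about 15, i.e. the model is still roughly 15-way uncertain at every token, so mostly garbled output at this stage isn't surprising:

```python
import math

loss = 2.7             # reported mean cross-entropy in nats (assumed)
print(math.exp(loss))  # perplexity ≈ 14.88 — still ~15 plausible tokens per step
```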
To Reproduce
Running on CPU (MacBook Air M2), omitting the model.cuda() line.
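Rather than deleting model.cuda() by hand, a device-agnostic setup would cover the CUDA, Apple-silicon, and CPU cases uniformly; a sketch, assuming a `model` variable as in train.py (the `mps` backend needs PyTorch 1.12+):

```python
import torch

# Pick the best available device instead of hard-coding model.cuda().
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():   # Apple-silicon GPU (M1/M2)
    device = torch.device("mps")
else:
    device = torch.device("cpu")

model = model.to(device)  # `model` assumed to come from train.py
# Inside the training loop, move each batch the same way:
# x, y = x.to(device), y.to(device)
```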
Expected behaviour
Some kind of convergence on sentences that are at least English-ish?
Screenshots
Additional context
Maybe my expectations are just off and I should train way, way more?