Results from word_language_model should be more organized #1186

Open
NoahSchiro opened this issue Aug 25, 2023 · 1 comment
Comments

@NoahSchiro
Contributor

Is your feature request related to a problem? Please describe.

When we run word_language_model/main.py, we can choose among several model architectures for the task at hand. However, whichever model is selected, the trained weights are saved by default to a generic "model.pt".

Similarly, generate.py writes its output to a generic generated.txt file.

Users can change both names on the command line, but the defaults are generic file names.

Describe the solution

I think it would be beneficial (for the purposes of comparing models) to write to "transformer.pt" or "lstm.pt" by default, so that each model gets its own file and analysis can be run on multiple models after training.

Similarly, the default name of the generated text file should follow the model name ([model]_gen.txt) instead of generated.txt.

All of this would also be better organized if put into a "results/" subdirectory within the word_language_model directory.
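The proposal above could be sketched roughly as follows. This is only an illustration, not the existing code in the example: the `default_paths` helper and the `RESULTS_DIR` constant are hypothetical names, and the sketch assumes the script exposes a `--model` option and a `--save` option whose default can be changed to `None`. An explicitly passed `--save` path still takes precedence.

```python
import argparse
from pathlib import Path

# Hypothetical default output directory, relative to word_language_model/.
RESULTS_DIR = Path("results")


def default_paths(model_name, results_dir=RESULTS_DIR):
    """Return (checkpoint_path, generated_text_path) derived from the model name."""
    stem = model_name.lower()
    return results_dir / f"{stem}.pt", results_dir / f"{stem}_gen.txt"


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", type=str, default="LSTM")
    # None means "derive the file name from --model"; an explicit path still wins.
    parser.add_argument("--save", type=str, default=None)
    args = parser.parse_args(argv)
    if args.save is None:
        args.save = str(default_paths(args.model)[0])
    return args
```

Before saving, the caller would still need to create the directory, for example with `Path(args.save).parent.mkdir(parents=True, exist_ok=True)`.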

Describe alternative solutions

I am open to hearing other ideas, or an argument for why a generic name is preferred as the default. It might also be useful to distinguish between models of the same architecture but with different hyper-parameters, though this could result in very long file names. A possible middle ground is to also save the text that is printed during training, so there is a log of batch sizes, how the loss changes over time, etc.
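One way to keep file names short while still telling hyper-parameter settings apart is to put a short digest in the name and write the full settings to a separate file. The sketch below is just one possible approach under that assumption; `run_tag` and `write_run_info` are hypothetical helpers, and the hyper-parameter dictionary passed in would come from wherever the script keeps its settings.

```python
import hashlib
import json
from pathlib import Path


def run_tag(model_name, hparams, digest_len=8):
    """Build a short, stable tag of the form '<model>-<hex digest>'."""
    # sort_keys makes the JSON, and so the digest, independent of dict order.
    payload = json.dumps(hparams, sort_keys=True).encode("utf-8")
    digest = hashlib.sha1(payload).hexdigest()[:digest_len]
    return f"{model_name.lower()}-{digest}"


def write_run_info(results_dir, model_name, hparams):
    """Write the full hyper-parameters to '<tag>.json' and return that path."""
    results_dir = Path(results_dir)
    results_dir.mkdir(parents=True, exist_ok=True)
    info_path = results_dir / f"{run_tag(model_name, hparams)}.json"
    info_path.write_text(json.dumps(hparams, indent=2, sort_keys=True))
    return info_path
```

The same tag could then name the checkpoint and a training log, so all files from one run share a prefix.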

I can create a PR in short order if this is deemed a valid change by the package maintainers.

@KossaiSbai

Hey there, I agree with your overall suggestion. I believe the file name should indicate only the model name; any other details can go in the TXT file itself, or in another file if required. Also, please note that the model info would be stored in the .pt file itself.

I am happy to take a look and submit a PR implementing those suggestions.
