
Default model sizes are much smaller than BERT base #81

Open
bertmaher opened this issue Aug 28, 2020 · 0 comments

@bertmaher
The base BERT model in https://arxiv.org/pdf/1810.04805.pdf uses 768 hidden features, 12 layers, 12 heads (which are also the defaults in bert.py), while the default configuration in the argparser of __main__.py uses 256/8/8. Would it make sense to align the example script with the paper? I spent quite a while puzzling over my low GPU utilization with the default configuration. Thanks!
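A minimal sketch of what aligning the defaults might look like, using `argparse` as `__main__.py` does. The flag names below (`--hidden`, `--layers`, `--attn_heads`) are illustrative assumptions and may not match the script's actual argument names:

```python
import argparse

# Hypothetical sketch: set the example script's defaults to BERT base
# (768 hidden units, 12 layers, 12 attention heads, per the paper),
# instead of the current 256/8/8.
parser = argparse.ArgumentParser()
parser.add_argument("--hidden", type=int, default=768,
                    help="hidden size of the transformer (BERT base: 768)")
parser.add_argument("--layers", type=int, default=12,
                    help="number of transformer layers (BERT base: 12)")
parser.add_argument("--attn_heads", type=int, default=12,
                    help="number of attention heads (BERT base: 12)")

args = parser.parse_args([])  # empty argv -> defaults
print(args.hidden, args.layers, args.attn_heads)  # 768 12 12
```

With these defaults the example script would match the paper's base configuration out of the box, so users benchmarking GPU utilization get representative numbers.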
