
[WIP] LAMB optimizer #1460

Open
wants to merge 3 commits into
base: master

Conversation

francoishernandez
Member

@francoishernandez francoishernandez commented Jun 5, 2019

[DO NOT MERGE]

This is a WIP implementation of the LAMB optimizer introduced for BERT pre-training. It reportedly allows training to scale to very large batch sizes. There are still some ambiguities: the algorithm differs between v1 and v2/v3 of the paper, some definitions are blurry, there is no official implementation yet (a few unofficial ones exist but differ on some points), and the paper gives no clear learning_rate schedule despite its detailed experiments.
Also, significant tuning may be needed to find appropriate values for our tasks.
I'm opening this PR as a basis for future work, once we have more elements.

The current version here is based on https://github.com/cybertronai/pytorch-lamb, which is itself derived from torch.optim.Adam.
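
For reference, here is a minimal sketch of the core idea, loosely following the trust-ratio formulation used in pytorch-lamb: an Adam-style moment update whose step is rescaled per parameter tensor by ||w|| / ||update||. The class name `Lamb`, the zero-norm fallback, and the omission of bias correction are simplifying assumptions for illustration, not the final code in this PR.

```python
import torch
from torch.optim import Optimizer


class Lamb(Optimizer):
    """Sketch of a LAMB-style optimizer (Adam update + layer-wise trust ratio)."""

    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999),
                 eps=1e-6, weight_decay=0.0):
        defaults = dict(lr=lr, betas=betas, eps=eps, weight_decay=weight_decay)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self, closure=None):
        loss = None
        if closure is not None:
            with torch.enable_grad():
                loss = closure()

        for group in self.param_groups:
            beta1, beta2 = group['betas']
            for p in group['params']:
                if p.grad is None:
                    continue
                grad = p.grad
                state = self.state[p]
                if len(state) == 0:
                    state['exp_avg'] = torch.zeros_like(p)
                    state['exp_avg_sq'] = torch.zeros_like(p)
                exp_avg, exp_avg_sq = state['exp_avg'], state['exp_avg_sq']

                # Adam-style first and second moment estimates.
                exp_avg.mul_(beta1).add_(grad, alpha=1 - beta1)
                exp_avg_sq.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)

                # Adam update direction, plus (decoupled) weight decay.
                update = exp_avg / (exp_avg_sq.sqrt().add_(group['eps']))
                if group['weight_decay'] != 0:
                    update = update.add(p, alpha=group['weight_decay'])

                # Layer-wise trust ratio: ||w|| / ||update||. Falling back to 1
                # when either norm is zero is a convention that varies between
                # implementations (one of the ambiguities mentioned above).
                w_norm = p.norm().item()
                u_norm = update.norm().item()
                trust_ratio = w_norm / u_norm if w_norm > 0 and u_norm > 0 else 1.0

                p.add_(update, alpha=-group['lr'] * trust_ratio)
        return loss
```

Usage would mirror Adam, e.g. `optimizer = Lamb(model.parameters(), lr=1e-3, weight_decay=0.01)`; finding appropriate hyperparameter values for our tasks is exactly the tuning work that remains.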

@alphadl
Contributor

alphadl commented Jul 16, 2019

LGTM
