"gas" configuration doesn't do anything #149

Open
segyges opened this issue Feb 4, 2024 · 0 comments

Contributor

segyges commented Feb 4, 2024

Per this, my understanding is that the gas config in neox doesn't do anything, so it shouldn't be used and should be removed. We should be using gradient_accumulation_steps instead.
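To make the concern concrete, here is a minimal sketch of a check that flags a config whose gas value would silently differ from the accumulation steps actually in effect. It assumes, per the understanding above, that gas is ignored and that gradient_accumulation_steps defaults to 1. The function name and the dict-based config are hypothetical and are not part of neox.

```python
def audit_accumulation(config: dict) -> dict:
    """Compare the requested `gas` value with the steps that would take effect.

    Assumption (from this issue, not verified here): `gas` is ignored and
    `gradient_accumulation_steps` defaults to 1 when it is not set.
    """
    requested = config.get("gas")
    effective = config.get("gradient_accumulation_steps", 1)
    return {
        "requested_gas": requested,
        "effective_steps": effective,
        # A mismatch means the author asked for one value and got another.
        "mismatch": requested is not None and requested != effective,
    }


# Hypothetical configs, for illustration only.
print(audit_accumulation({"gas": 1}))
print(audit_accumulation({"gas": 2}))
```

Under that assumption, a config with gas at 1 is harmless, while a config with gas at 2 and no gradient_accumulation_steps is reported as a mismatch.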

It appears that all existing pythia configs set gas to 1, which is the default for gradient_accumulation_steps anyway, so for those configs it does not matter. Per that same search, some of the old eval results specifically show gas at 2. That would be a serious error: if the expectation was that gas did something, the effective batch size would have been half of what was intended.
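The halving can be checked with simple arithmetic. The sketch below assumes the usual data-parallel relation in which the effective batch size is the per-GPU micro batch size times the accumulation steps times the number of GPUs. The specific numbers are made up for illustration and do not come from any pythia config.

```python
def effective_batch_size(micro_batch_per_gpu: int,
                         accumulation_steps: int,
                         num_gpus: int) -> int:
    """Samples consumed per optimizer step under plain data parallelism."""
    return micro_batch_per_gpu * accumulation_steps * num_gpus


# Hypothetical values: 4 samples per GPU on 8 GPUs.
intended = effective_batch_size(4, accumulation_steps=2, num_gpus=8)  # gas=2 honored
actual = effective_batch_size(4, accumulation_steps=1, num_gpus=8)    # gas ignored

print(intended, actual)  # 64 32
```

If gas at 2 was ignored, training ran with accumulation steps of 1, so each optimizer step saw half as many samples as the config suggests.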

I am not putting in a PR to replace gas with gradient_accumulation_steps because these configs are references for the settings of existing artifacts. It's not clear to me that they should be fixed to be "correct", or, if they are, what the right steps would be to make sure the originals are preserved as references for those artifacts once the configuration is fixed going forward.
