
--batch vs --batch-gpu #412

Open
nadavpo opened this issue Jul 23, 2023 · 1 comment


nadavpo commented Jul 23, 2023

In configs.md you mention the --batch-gpu parameter.
You also show an example of running on 1 GPU where --batch is set to 32 and --batch-gpu to 16. What effect does that have? If the batch size is 32 but you restrict the samples per batch per GPU to 16, and you only have one GPU, doesn't that mean your batch size is actually 16?


PDillis commented Oct 20, 2023

Hello, I know this is an old question, but what they did is gradient accumulation: say you want to do a backward pass on a batch size of 32, but you can only fit a batch of 16 on your current GPU. So you accumulate the gradients from two forward passes, then do the backward/optimizer step, which gives you an effective batch size of 32. This is also why, if you don't specify --batch-gpu and only --batch, that batch size will be divided by the number of GPUs you are using. Hope this helps.
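
A minimal PyTorch sketch of what this looks like (just an illustration of gradient accumulation, not the repository's actual training loop; the model and loss here are placeholders):

```python
import torch

model = torch.nn.Linear(512, 1)                 # placeholder model
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

batch = 32        # --batch: effective batch size per optimizer step
batch_gpu = 16    # --batch-gpu: samples that fit on one GPU per forward pass
accum_steps = batch // batch_gpu                # 2 forward passes per optimizer step

opt.zero_grad()
for _ in range(accum_steps):
    x = torch.randn(batch_gpu, 512)             # one micro-batch of 16 samples
    loss = model(x).mean()
    (loss / accum_steps).backward()             # gradients accumulate across micro-batches
opt.step()                                      # single step over the full batch of 32
```

So even though only 16 samples are on the GPU at a time, the optimizer step is still computed over 32 samples.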
