
About batch_size #61

Open
tszslovewanpu opened this issue Mar 16, 2024 · 0 comments

Comments

@tszslovewanpu

Hi, I have a conceptual question; I haven't worked with Megatron before.

1. Normally, training a model requires setting a batch_size, e.g. 4 (4 examples), so one iteration consumes 4 samples. If that minibatch is further split into 4 micro-batches, then micro_batch_size = 1 (1 example). Is that the right way to understand it?
2. A concrete training case: running on a single node with 8 GPUs, with NNodes=1 / GPUs_per_node=8 / TP=8 / PP=1 / DP=8 / global_batch_size=1 / micro_batch_size=1, I find that each iteration consumes only 1 example, and GPU memory is completely full. Am I foolishly slicing one example into eight pieces and feeding them to the eight GPUs, doing useless work? Some guidance would be appreciated.

Thanks in advance!
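For reference, the relation between these sizes in Megatron-style training is commonly stated as global_batch_size = micro_batch_size × num_micro_batches × data_parallel_size, with DP derived from the world size and the TP/PP degrees. The sketch below is a hypothetical illustration of that arithmetic (the helper `num_micro_batches` is not a real Megatron API), applied to the configuration from the question:

```python
# Hypothetical sketch of Megatron-style batch-size arithmetic; the
# helper below is for illustration only, not an actual Megatron API.
def num_micro_batches(global_batch_size, micro_batch_size, data_parallel_size):
    """Gradient-accumulation steps per iteration."""
    samples_per_step = micro_batch_size * data_parallel_size
    assert global_batch_size % samples_per_step == 0, (
        "global_batch_size must be divisible by micro_batch_size * DP")
    return global_batch_size // samples_per_step

# Configuration from the question: single node, 8 GPUs, TP=8, PP=1.
# DP is determined by world_size // (TP * PP), so here DP = 1, not 8:
# tensor parallelism already uses all 8 GPUs for a single model replica.
world_size, tp, pp = 8, 8, 1
dp = world_size // (tp * pp)
print(dp)  # 1

# With global_batch_size=1, micro_batch_size=1, DP=1, each iteration
# really does process just 1 example, split across the 8 GPUs by TP.
print(num_micro_batches(1, 1, dp))  # 1

# Question 1's scenario: batch of 4 split into 4 micro-batches of 1.
print(num_micro_batches(4, 1, dp))  # 4
```

Under this reading, each GPU holds one-eighth of the model's weights and activations (which is why memory is full even for one example); it is not necessarily useless work, but raising global_batch_size would amortize the cost over more samples per iteration.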
