Poor GPU utilization when using multi-node ZeRO-3 on small model #5488

Closed Answered by BramVanroy
BramVanroy asked this question in Q&A
For whoever reads this: my disastrous performance was caused by something else. I don't really remember why, but I had CUDA_LAUNCH_BLOCKING=1 set in my launch bash script, and that caused the significant slowdown.
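
In case it helps someone avoid the same trap: CUDA_LAUNCH_BLOCKING=1 makes CUDA kernel launches synchronous, which is useful for debugging but removes the overlap between host and device work. Below is a minimal sketch of a guard you could call at the top of a training script. The function names are hypothetical (not part of DeepSpeed or PyTorch), and treating any value other than empty or "0" as "enabled" is an assumption of this sketch.

```python
import os
import warnings


def cuda_launch_blocking_enabled(env=None):
    """Return True if CUDA_LAUNCH_BLOCKING looks enabled in the given environment.

    Assumption: any value other than "" or "0" counts as enabled.
    """
    env = os.environ if env is None else env
    value = str(env.get("CUDA_LAUNCH_BLOCKING", "0")).strip()
    return value not in ("", "0")


def warn_if_launch_blocking(env=None):
    """Emit a warning when synchronous kernel launches appear to be forced."""
    if cuda_launch_blocking_enabled(env):
        warnings.warn(
            "CUDA_LAUNCH_BLOCKING is set; kernel launches are synchronous, "
            "which can slow training significantly. Unset it unless you are debugging.",
            RuntimeWarning,
        )
        return True
    return False


if __name__ == "__main__":
    # Check the real process environment before training starts.
    warn_if_launch_blocking()
```

Calling `warn_if_launch_blocking()` before initializing the model only reports the setting; it does not change it. Removing the `export CUDA_LAUNCH_BLOCKING=1` line from the launch script is still the actual fix.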

Answer selected by BramVanroy
Category
Q&A
Labels
None yet
2 participants