AMD EPYC 7K62 NCCL-test 4090 bandwidth too #1285

Open
ghoul02015 opened this issue May 13, 2024 · 1 comment
Comments

ghoul02015 commented May 13, 2024

[Screenshots: WechatIMG48, WechatIMG49, WechatIMG50]
CPU: AMD EPYC 7K62
Memory: 32 GB × 16
Why is P2P performance so slow on AMD EPYC? Is there anything I need to set?

Member

sjeaugey commented May 13, 2024

You may try to set NCCL_P2P_LEVEL=PXB and see whether it works better.
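
As a rough sketch of how that setting could be applied, the snippet below builds an environment with `NCCL_P2P_LEVEL` set and passes it to a test launch. The binary path `./build/all_reduce_perf`, the choice of `all_reduce_perf` as the test, the size flags, and the helper names `env_with_p2p_level` and `run_nccl_test` are assumptions for illustration; the thread does not say which test or arguments were used.

```python
import os
import subprocess

# Assumed location of an nccl-tests binary; adjust to your own build.
NCCL_TEST_BINARY = "./build/all_reduce_perf"


def env_with_p2p_level(level="PXB", base_env=None):
    """Return a copy of the environment with NCCL_P2P_LEVEL set to `level`."""
    env = dict(os.environ if base_env is None else base_env)
    env["NCCL_P2P_LEVEL"] = level
    return env


def run_nccl_test(num_gpus, level="PXB"):
    """Launch the test under the given P2P level (needs GPUs and nccl-tests)."""
    # -b/-e: min/max message size, -f: size multiplier, -g: GPUs per process.
    cmd = [NCCL_TEST_BINARY, "-b", "8", "-e", "128M", "-f", "2", "-g", str(num_gpus)]
    return subprocess.run(cmd, env=env_with_p2p_level(level), check=True)
```

Running the same command once with and once without the variable gives a direct before/after comparison of the reported bandwidth.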

Note that p2pBandwidth only creates traffic between 2 GPUs at once. Performance tends to degrade significantly when we have traffic going from/to all GPUs and interleaving in the CPU. It would also be interesting to see what performance you get with 2 GPUs only (which would be kind of equivalent to the p2pBandwidth test in bidirectional mode and --sm_copy mode).
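
One way to set up the 2-GPU comparison is to restrict the visible devices and match the GPU count passed to the test. This is only a sketch: the helper name `two_gpu_run`, the GPU indices `0` and `1`, the binary path, and the size flags are assumptions, not details from this thread.

```python
import os


def two_gpu_run(gpu_ids=(0, 1), binary="./build/all_reduce_perf", base_env=None):
    """Build the command and environment for a test limited to `gpu_ids`."""
    env = dict(os.environ if base_env is None else base_env)
    # Hide every GPU except the selected ones from CUDA.
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    # -g must match the number of GPUs left visible.
    cmd = [binary, "-b", "8", "-e", "128M", "-f", "2", "-g", str(len(gpu_ids))]
    return cmd, env
```

The returned pair can be passed to `subprocess.run(cmd, env=env, check=True)`. Comparing this result with the all-GPU run shows how much bandwidth is lost once traffic from all GPUs interleaves in the CPU.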
