-
Notifications
You must be signed in to change notification settings - Fork 736
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Provenance of NVTX headers in NCCL #1270
Comments
We are using a snapshot of the headers from an internal repo. Funny you should ask though -- I've been pushing on the parties involved to get them in shape for the last few weeks, as they've been causing compilation warnings, especially with clang. Let me ping them again... So I'm hoping that the next NCCL release will have an updated version, but it won't be the same as what's in the current CUDA release. |
I can help with clang-related issues. If the warnings are related to C++, clang is usually more strict than nvcc, and usually has a good reason to complain.
It does not have to be the same, as long as they can be used instead of the headers in CUDA-12. It would work for us short-term, as long as we have a variant that works for everyone. Long term I would expect to see problems if/when NCCL's copy and NVTX repo and/or the version shipped with CUDA start to diverge. Shipping a fork as part of another library is a potential timebomb. |
Fixing the clang-related issues is not a problem; it's just that we didn't want to diverge from the upstream, which is what prompted us to attempt to resync with them. |
FYI, in the next NCCL version we will be using the NVTX headers based on https://github.com/NVIDIA/NVTX/tree/dev. |
NCCL appears to carry some sort of fork of
nvtx3
headers that match neither the headers in https://github.com/NVIDIA/nvtx nor the nvtx3 headers distributed with CUDA-12.x. Unfortunately, NCCL commit history does not mention the provenance of these headers, so it's not clear if they are a snapshot of some WIP branch in NVIDIA's internal repo, or if it's something developed by/for NCCL.My issue is that when I build an application with CUDA-12, I generally need to compile everything with one set of those headers. Debugging ODR violations caused by different parts of the build including different variants of the headers is not fun.
Would it be possible to switch NCCL to use pristine upstream NVTX headers, and either keep NCCL-specific extensions only, or upstream the changes if they are not NCCL-specific?
@jrhemstad, @sjeaugey
The text was updated successfully, but these errors were encountered: