Cannot use <limits> and <cuda/std/limits> in the same source file #107

Open
shwina opened this issue Jul 18, 2022 · 4 comments


shwina commented Jul 18, 2022

Invoking jitify with the following source file:

#include <limits>
#include <cuda/std/limits>

as follows:

jitify2_preprocess -std=c++11 -D__CUDACC_RTC__ test.hpp

results in:

Error processing source file test.hpp
Compilation failed: NVRTC_ERROR_COMPILATION
Compiler options: "-std=c++11 -D__CUDACC_RTC__ -include=jitify_preinclude.h -default-device"
detail/libcxx/include/limits(211): error: identifier "__CHAR_BIT__" is undefined

detail/libcxx/include/limits(312): error: identifier "__FLT_MANT_DIG__" is undefined

detail/libcxx/include/limits(313): error: identifier "__FLT_DIG__" is undefined

detail/libcxx/include/limits(321): error: identifier "__FLT_RADIX__" is undefined

detail/libcxx/include/limits(325): error: identifier "__FLT_MIN_EXP__" is undefined

<many more similar errors>

As a workaround I can do:

#include <limits>
#include <cuda/std/climits>
#include <cuda/std/limits>

@maddyscientist (Collaborator)

@benbarsdell this is the same issue I reported a while ago. Did you have a chance to think about how to fix this?

@benbarsdell (Member)

I'll see if I can take another look at this later this week.

bdice commented Aug 23, 2022

@benbarsdell Hi, any updates on this? I'm reviewing rapidsai/cudf#11287 and would like to understand the issue / what solutions might be possible.

@benbarsdell (Member)

I believe the root cause of this is the #include <climits> header being loaded from jitify's builtins and cached, and then, when #include "climits" is encountered within libcu++, jitify uses the cached version instead of the new one.
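
To make the suspected collision concrete, here is a minimal sketch of a header cache keyed by name only. This is not jitify's actual data structure: `NameOnlyHeaderCache` and `get_or_load` are hypothetical names used purely for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical name-only header cache (an assumption for illustration,
// not jitify's real implementation).
class NameOnlyHeaderCache {
 public:
  // Returns the cached contents for `name`. `contents_if_loaded` is only
  // stored the first time a given name is requested.
  const std::string& get_or_load(const std::string& name,
                                 const std::string& contents_if_loaded) {
    assert(!name.empty() && "header name must not be empty");
    auto it = cache_.find(name);
    if (it == cache_.end()) {
      it = cache_.emplace(name, contents_if_loaded).first;
    }
    return it->second;
  }

 private:
  std::map<std::string, std::string> cache_;
};
```

With such a cache, a first request for `climits` (standing in for `#include <climits>` resolved to a builtin) pins that entry, and a later request for `climits` coming from `#include "climits"` inside libcu++ gets the same builtin contents back, whatever the second header would have contained.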

The solution will be to distinguish between #include <foo> and #include "foo" in the header cache. However, it is further complicated by the fact that NVRTC does not support such a distinction. I think the only way around that will be to automatically patch #include "foo" to #include </path/to/foo> (if and only if /path/to/foo exists).
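
A rough sketch of what that patching step could look like, under stated assumptions: `patch_quoted_includes` is a hypothetical helper (not a jitify function), the directory of the including file is passed in as `current_dir`, and the file-existence check is injected as a callback so the sketch does not touch the real filesystem. It handles only simple one-line `#include "foo"` directives and always terminates each output line with a newline.

```cpp
#include <cassert>
#include <functional>
#include <regex>
#include <sstream>
#include <string>

// Hypothetical helper (an assumption for illustration, not jitify's API).
// Rewrites `#include "foo"` to `#include <current_dir/foo>` if and only if
// exists(current_dir + "/" + foo) returns true. Other lines are unchanged.
std::string patch_quoted_includes(
    const std::string& source, const std::string& current_dir,
    const std::function<bool(const std::string&)>& exists) {
  assert(exists && "exists predicate must be callable");
  // Groups: (1) directive prefix, (2) quoted header name, (3) rest of line.
  static const std::regex quoted_include(
      R"re(^(\s*#\s*include\s*)"([^"]+)"(.*)$)re");
  std::istringstream in(source);
  std::string line;
  std::string out;
  while (std::getline(in, line)) {
    std::smatch m;
    if (std::regex_match(line, m, quoted_include)) {
      const std::string candidate = current_dir + "/" + m[2].str();
      if (exists(candidate)) {
        line = m[1].str() + "<" + candidate + ">" + m[3].str();
      }
    }
    out += line;
    out += '\n';
  }
  return out;
}
```

Because the rewritten directive carries a full path, a cache keyed by the include string would no longer confuse it with the builtin `<climits>`. A real implementation would also have to deal with include search paths, macros that expand to include directives, and directives inside comments, none of which this sketch attempts.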

Unfortunately this is easier said than done, which is why I haven't got to it yet.

In terms of workarounds, removing #include <limits> and just using the libcu++ version should work, if that's doable in your code. There may be other workarounds too.

rapids-bot bot pushed a commit to rapidsai/cudf that referenced this issue Aug 25, 2022
This PR enables using [upstream jitify2](https://github.com/NVIDIA/jitify/tree/jitify2) rather than RAPIDS' fork of [jitify2](https://github.com/rapidsai/jitify/tree/cudf_0.19). 

This enables us to take advantage of the latest additions/improvements to jitify. Most notably: upstream jitify2 dlsym/dlopens `libcuda.so` which enables us to [drop our shared library dependency on `libcuda.so`](#11370).

---

Two major issues came up when making the switch:

1. NVIDIA/jitify#107 - I used the workaround mentioned in that issue. Hopefully it is fixed soon and we can eliminate the workaround.
2. We need to pass `-D_FILE_OFFSET_BITS=64` to jitify. Due to limitations in the way conda-forge builds glibc, we must explicitly state we require 64bit file offset support.

Authors:
  - Ashwin Srinath (https://github.com/shwina)

Approvers:
  - Yunsong Wang (https://github.com/PointKernel)
  - Bradley Dice (https://github.com/bdice)
  - Robert Maynard (https://github.com/robertmaynard)

URL: #11287