Using __half with NVRTC and jitify #56

Open
qavl opened this issue Apr 8, 2020 · 4 comments

@qavl

qavl commented Apr 8, 2020

Hi!
I want to use the fp16 data type in kernels compiled with NVRTC. However, when I try to do so, I get the following errors:

warning: cuda_fp16.h: File not found
---------------------------------------------------
--- JIT compile log for ...---
---------------------------------------------------
error: identifier "__half" is undefined


Can you tell me the best way to use cuda_fp16.h with jitify?

Thank you.

@benbarsdell
Member

To ensure cuda_fp16.h can be found you'll need to pass the CUDA Toolkit include directory as a flag like this: -I/path/to/cuda/include.

Here's a minimal example (it uses half which is equivalent to __half):

  jitify::JitCache jit_cache;
  jit_cache
      .program(R"(fp16_program
#include <cuda_fp16.h>
__device__ half value;
)",
               {}, {"-I/usr/local/cuda/include"})
      .kernel("")
      .instantiate();

@qavl
Author

qavl commented Apr 9, 2020

Ok, I see.
Thank you)
I guess one needs a workaround that passes the exact CUDA path at runtime to make it portable, right?

@benbarsdell
Member

That's right. One option would be to use an environment variable like CUDA_PATH.
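
As a rough sketch of that idea, the include flag could be built at run time from CUDA_PATH. The helper names `include_flag_from_root` and `cuda_include_flag` are hypothetical (they are not part of jitify), and the `/usr/local/cuda` fallback is an assumption about a typical Linux install; CUDA_PATH is not guaranteed to be set on every system.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of jitify): turn a CUDA install root into
// an "-I<root>/include" compiler flag. If cuda_root is null or empty, the
// fallback root is used instead; "/usr/local/cuda" is an assumed default.
std::string include_flag_from_root(
    const char* cuda_root,
    const std::string& fallback_root = "/usr/local/cuda") {
  std::string root = (cuda_root && *cuda_root) ? cuda_root : fallback_root;
  // Drop trailing path separators so the result has no doubled slash.
  while (root.size() > 1 && (root.back() == '/' || root.back() == '\\')) {
    root.pop_back();
  }
  return "-I" + root + "/include";
}

// Hypothetical helper: read the CUDA_PATH environment variable at run time.
std::string cuda_include_flag() {
  return include_flag_from_root(std::getenv("CUDA_PATH"));
}
```

The result of `cuda_include_flag()` could then be passed in the options list in place of the hard-coded "-I/usr/local/cuda/include" from the example earlier in the thread.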

@maddyscientist
Collaborator

Surely we don't need another environment variable; can't we just use the existing JITIFY_OPTIONS environment variable to set this at run time?
