
Add ability to link CUDA functions with in-memory PTX. #9470

Open · wants to merge 1 commit into base: main
Conversation


@ed-o-saurus commented Feb 27, 2024

This commit adds the PTXCode class. It is a simple wrapper around a string. This allows the user to link CUDA functions with dynamically generated PTX code without having to write data to a file.
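The idea can be illustrated with a minimal sketch. The `PTXCode` name is the one proposed in this PR; the body below is a hypothetical illustration of "a simple wrapper around a string", not the actual implementation, and `is_in_memory` is an invented helper showing how a linker could distinguish wrapped in-memory PTX from a plain file path:

```python
class PTXCode:
    """Sketch of the proposed wrapper: it simply holds PTX assembly text
    so the linker can tell in-memory code apart from a file path
    (which remains a plain str)."""

    def __init__(self, ptx):
        self.ptx = ptx


def is_in_memory(item):
    # Hypothetical dispatch helper: paths stay strings, so wrapping the
    # PTX in a class is enough to preserve backwards compatibility.
    return isinstance(item, PTXCode)


assert is_in_memory(PTXCode(".version 7.0"))
assert not is_in_memory("kernels.ptx")
```

Because existing code passes paths as strings, type-based dispatch like this keeps the current `link=` behaviour unchanged.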

@gmarkall added the CUDA (CUDA related issue/PR) label Feb 28, 2024
@gmarkall (Member) left a comment

Many thanks for the PR, @ed-o-saurus - this is an often-requested feature so I'm happy to see this PR!

I have a few thoughts to add on the design and scope of the changes, but I like the idea of using an object to hold the code while strings continue to be used for paths, maintaining backwards compatibility.

I'll post a follow-up once I've had more time to get my thoughts down.

isVoid pushed a commit to rapidsai/pynvjitlink that referenced this pull request Mar 19, 2024
Adds support for linking code from memory to Numba's `@cuda.jit`
decorator (in addition to the already-supported linking files from
disk). New classes are added to Numba's top-level `cuda` module:

* `Archive`: An archive of objects
* `CUSource`: A CUDA C/C++ source
* `Cubin`: A cubin ELF
* `Fatbin`: A fatbin ELF
* `Object`: An object file
* `PTXSource`: PTX assembly source code

These are all used by constructing them with a single argument, the code
in memory to use. Once created, they can then be passed to the `link=`
kwarg of the `@cuda.jit` decorator. An example showing a use case with a
CUDA C/C++ source is added.

This implementation is inspired by the approach outlined by @ed-o-saurus
in numba/numba#9470.
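The class names and the construction pattern above can be sketched as plain Python. This is a hedged illustration only: the class names and the single-argument constructor come from this commit message, but the `LinkableCode` base class and `kind` attribute are assumptions made here for the sketch, not the actual pynvjitlink implementation:

```python
class LinkableCode:
    """Assumed common base: each class wraps the in-memory code to pass
    to the link= kwarg of @cuda.jit."""

    def __init__(self, data):
        self.data = data


class Archive(LinkableCode):
    kind = "a"        # an archive of objects


class CUSource(LinkableCode):
    kind = "cu"       # CUDA C/C++ source


class Cubin(LinkableCode):
    kind = "cubin"    # cubin ELF


class Fatbin(LinkableCode):
    kind = "fatbin"   # fatbin ELF


class Object(LinkableCode):
    kind = "o"        # object file


class PTXSource(LinkableCode):
    kind = "ptx"      # PTX assembly source


# Construction mirrors the commit message: a single argument, the code
# in memory. The resulting object would then be passed via link= to
# the @cuda.jit decorator (not shown; requires a CUDA GPU).
src = CUSource('extern "C" __device__ int f(int *out) { *out = 1; return 0; }')
assert src.kind == "cu"
```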

Notes on changes:
* Various tests now run on the GPU, so test binaries need to be
generated using the relevant compute capability - this change is applied
in the `Makefile`, along with some refactoring to tidy it up a little.
* Tests for the new functionality are added. In addition, existing tests that
used the test binaries are modified to use a relevant compute capability
(usually that of the test machine), because the test binaries are now built
for the CC of the current GPU; it is no longer sufficient to hard-code CCs
like 7.5 or 7.0 in tests. Since fixtures are needed across test files, I've
started moving them into `conftest.py`.

---------

Co-authored-by: Bradley Dice <bdice@bradleydice.com>
Co-authored-by: jakirkham <jakirkham@gmail.com>