Enable building C++ code with GPU (CUDA) support #908

Open. fabratu wants to merge 3 commits into master from 20220314_cuda_bfs.

Conversation

@fabratu (Member) commented Mar 22, 2022

This PR adds support for CUDA-based GPU computing.

The following thoughts went into the PR:

  • CMake is responsible for CUDA detection, triggered by a new parameter NETWORKIT_CUDA. If the CUDAToolkit is found, all .cu files are automatically compiled with nvcc. Maybe we can also support clang -x cuda in the future? If the toolkit is not found, .cu files are marked as C++ language files so that they still get compiled. Another possibility would be to keep the normal .cpp file ending and instead track which files need to be compiled with either nvcc or clang -x cuda.
  • setup.py only uses a simple check to decide whether to pass NETWORKIT_CUDA to CMake. It searches for the nvidia-smi tool, which is installed on Linux and Windows if the official drivers are used. The check is triggered when passing enable-gpu to build_ext. Maybe that is one check too many, and we could just rely on autodetection of CUDA GPUs.
  • Preprocessor check using the __CUDACC__ macro. All CUDA-specific code should be wrapped in an #ifdef check so that it also compiles on systems without a GPU (similar to AVX2). Even though some CUDA-specific macros like __global__, __host__ and so on could also be wrapped in another macro (similar to NETWORKIT_EXPORT for Windows builds), there are certain common variables (like threadIdx) that would make an #ifdef wrap necessary for kernel functions anyway.
  • Helper class Aux::GPUTools for wrapping CUDA calls and managing GPU devices. This detaches GPU handling from the algorithm implementations and removes the need for CUDA headers in the algorithms' header files (see the temporary ToyCentrality.hpp/cu example). It can also help with runtime detection of GPUs and with detaching the Cython interface from GPU code. Otherwise we would either have to compile every Cython module containing at least one GPU class with nvcc/clang -x cuda, or add an additional module for all GPU-enabled classes.
  • Gtest for the helper class (working on both GPU and non-GPU hardware).
  • This will be removed before merging the PR, of course (it is included to show how the C++ integration is done): ToyCentrality.hpp/cu shows the usage of preprocessor and runtime checks for a simple GPU kernel. The kernel function has to be outside the class definition in order to work properly. Tested on both CPU and GPU hardware.

What is missing (but likely out of scope of this PR):

  • GitHub Actions doesn't provide runners with GPUs. We would need appropriate self-hosted runners (both Linux and Windows) with such capabilities to test the code.

@fabratu force-pushed the 20220314_cuda_bfs branch 11 times, most recently from 6f70415 to beee6b0 on March 24, 2022 12:04
@fabratu force-pushed the 20220314_cuda_bfs branch 3 times, most recently from ec482ba to c3a60cf on August 5, 2022 15:27
@fabratu fabratu changed the title [WIP] Enable building C++ code with GPU (CUDA) support Enable building C++ code with GPU (CUDA) support Aug 5, 2022
@fabratu (Member, Author) commented Sep 15, 2022

It seems my current implementation mixes the driver and the runtime API of CUDA. While this is technically OK according to the documentation, we either need more low-level control over context management (threads, processes), or we switch to the runtime API only and use its implicit context management (with some unknown overhead).

Fabian Brandt-Tumescheit added 2 commits December 5, 2022 13:10