
Introducing hipification of CuPy for ROCm compatibility with CUDA #8316

Open
wants to merge 20 commits into base: main
Conversation

AdrianAbeyta (Contributor):

Overview

This pull request introduces the hipify utility to facilitate the conversion of CUDA C/C++/Cython code into HIP C/C++/Cython code within CuPy. Hipify offers a straightforward search-and-replace mechanism that maps CUDA constructs to their HIP equivalents, allowing CuPy to support AMD's ROCm ecosystem seamlessly. This addition particularly targets the conversion of CUDA files to HIP equivalents for CUDA/Stub builds, as detailed in the modified setup.py.
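
To make the search-and-replace mechanism concrete, here is a minimal, purely illustrative sketch of a hipify-style substitution pass; the mapping table and helper name are hypothetical and far smaller than the real hipify_torch mapping.

CUDA_TO_HIP = {
    'cudaMalloc': 'hipMalloc',
    'cudaStream_t': 'hipStream_t',
    'curandCreateGenerator': 'hiprandCreateGenerator',
    '#include <cuda_runtime.h>': '#include <hip/hip_runtime.h>',
}

def hipify_source(text: str) -> str:
    # Apply each CUDA-to-HIP substitution to the source text.
    for cuda_token, hip_token in CUDA_TO_HIP.items():
        text = text.replace(cuda_token, hip_token)
    return text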

Main Changes

Integration of hipify:
The primary modification is in setup.py (see the relevant section); an illustrative sketch of this flow follows the list below:

  • Determines whether to trigger the hipification process based on the presence of ROCm.

  • Utilizes hipify to generate HIP-compatible files by recursively converting the CUDA codebase.

  • Implements a custom mapping configuration, ensuring a precise mapping from CUDA constructs to HIP constructs.
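
As a rough sketch of the flow above (not the PR's actual setup.py code): the CUPY_INSTALL_USE_HIP switch is CuPy's existing ROCm build flag, while the tiny CUDA_TO_HIP table and hipify_tree helper below are stand-ins for hipify_torch.

import os
import pathlib

CUDA_TO_HIP = {'cudaStream_t': 'hipStream_t', 'cuda_runtime.h': 'hip/hip_runtime.h'}  # tiny stand-in mapping

def building_for_rocm() -> bool:
    # CuPy builds for ROCm/HIP when CUPY_INSTALL_USE_HIP=1 is set in the environment.
    return os.environ.get('CUPY_INSTALL_USE_HIP', '0') == '1'

def hipify_tree(root: str) -> None:
    # Toy stand-in for hipify_torch: rewrite CUDA headers into *_hip.* siblings,
    # leaving the original CUDA sources untouched for NVIDIA builds.
    for path in pathlib.Path(root).rglob('*.h'):
        original = path.read_text()
        hipified = original
        for cuda_token, hip_token in CUDA_TO_HIP.items():
            hipified = hipified.replace(cuda_token, hip_token)
        if hipified != original:
            path.with_name(path.stem + '_hip' + path.suffix).write_text(hipified)

if building_for_rocm():
    hipify_tree('cupy_backends')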

Custom mapping and hipify mechanism:

  • Introduces a specific mapping, converting CUPY_USE_GEN_HIP_CODE to CUPY_DONT_USE_GEN_HIP_CODE.

  • Facilitates a mechanism for selecting the appropriate libraries only if on ROCm platforms; defaults to legacy CUDA code otherwise.

cuRAND Example:

An example integration is provided for cuRAND, demonstrating the specific mapping and the library-switching mechanism between HIP and legacy CUDA versions (see the mechanism here).
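
Below is a minimal sketch of that switching idea, written as a runtime check for readability. In the PR itself the selection happens at Cython compile time via the CUPY_USE_GEN_HIP_CODE / CUPY_DONT_USE_GEN_HIP_CODE constants, so the module paths and the boolean parameter here are illustrative.

import importlib

def load_curand_backend(use_hip: bool):
    # On ROCm, prefer the hipify-generated module; otherwise use the
    # legacy CUDA wrapper. Module paths are illustrative.
    name = ('cupy_backends.cuda.libs.curand_hip'
            if use_hip else 'cupy_backends.cuda.libs.curand')
    return importlib.import_module(name)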

Impact

This update ensures that CuPy can smoothly support both CUDA and ROCm, widening its compatibility range. It reduces manual intervention in maintaining both ecosystems and streamlines the conversion process, particularly in CUDA-to-HIP mappings. Importantly, existing CUDA code remains unchanged, ensuring compatibility for NVIDIA GPUs.

mergify bot commented May 1, 2024

This pull request is now in conflicts. Could you fix it @AdrianAbeyta? 🙏

takagi self-assigned this on May 2, 2024
takagi added the cat:enhancement (Improvements to existing features) and prio:medium labels on May 2, 2024
Comment on lines +129 to +130
compile_time_env['CUPY_USE_GEN_HIP_CODE'] = 1
compile_time_env['CUPY_DONT_USE_GEN_HIP_CODE'] = 0
Member:
Do we need both of them? Doesn't compile_time_env['CUPY_USE_GEN_HIP_CODE'] = 0 mean compile_time_env['CUPY_DONT_USE_GEN_HIP_CODE'] = 1?

Contributor Author:

Yes, we currently need both environment flags for ROCm: CUPY_USE_GEN_HIP_CODE and CUPY_DONT_USE_GEN_HIP_CODE. These flags serve different purposes at different stages of the process.

Compile Time:
During the compilation of HIP files, CUPY_USE_GEN_HIP_CODE indicates that HIP code generation is enabled. Hipify then applies a custom mapping at compile time from CUPY_USE_GEN_HIP_CODE to CUPY_DONT_USE_GEN_HIP_CODE (see the implementation here).

Runtime:
At runtime, we use the new custom mapping with CUPY_DONT_USE_GEN_HIP_CODE. This avoids circular import errors caused by Cython constraints.

Thus, CUPY_USE_GEN_HIP_CODE=0 for compilation does not automatically imply CUPY_DONT_USE_GEN_HIP_CODE=1 for runtime. They are set independently to handle different phases (compilation vs. runtime) without causing conflicts or errors.
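
A hedged sketch of that arrangement, with the values taken from the diff above and the rename helper standing in for the actual hipify mapping step:

# setup.py defines both compile-time constants independently.
compile_time_env = {}
compile_time_env['CUPY_USE_GEN_HIP_CODE'] = 1       # compile phase: HIP code generation enabled
compile_time_env['CUPY_DONT_USE_GEN_HIP_CODE'] = 0  # runtime path keyed separately

def apply_flag_mapping(source: str) -> str:
    # Custom mapping step: the generated *_hip.* sources refer to the other
    # constant, keeping the original and generated code paths distinct and
    # avoiding circular imports at runtime.
    return source.replace('CUPY_USE_GEN_HIP_CODE', 'CUPY_DONT_USE_GEN_HIP_CODE')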

@@ -77,6 +77,7 @@ jobs:

- name: Build
run: |
pip install git+https://github.com/ROCmSoftwarePlatform/hipify_torch.git
Member:

Do we need hipify_torch even on a CUDA build?

Contributor Author:

No, we do not need hipify_torch on a CUDA build. I updated pretest.yml to reflect these changes. Thanks!

@@ -87,6 +88,7 @@ jobs:
python -Werror::DeprecationWarning -m compileall -f -q cupy cupyx examples tests docs
pushd docs
pip install -r requirements.txt
pip install -r rocm-requirements.txt
takagi (Member), May 8, 2024:

Also, do we need hipify_torch for the doc build? I suppose doc building uses stub files and is independent of the ROCm stuff.

Contributor Author:

We also do not need hipify_torch for the doc build since it uses stub files, as you mentioned. I updated pretest.yml to reflect this.

takagi (Member) commented May 8, 2024

docker: failed to register layer: write /opt/rocm-6.1.0/lib/llvm/bin/flang-legacy: no space left on device.
https://github.com/cupy/cupy/actions/runs/8914357746/job/24481782055

It seems that this is because the rocm docker image has grown larger than before (rocm-5.0). I checked the sizes of containers that pulled rocm images (see the SIZE column):

$ docker system df -v
REPOSITORY                                                                         TAG                        IMAGE ID       CREATED         SIZE      SHARED SIZE   UNIQUE SIZE   CONTAINERS
rocm/dev-ubuntu-22.04                                                              6.1-complete               0708dce5557e   2 weeks ago     26.2GB    0B            26.23GB       1
rocm/dev-ubuntu-20.04                                                              5.0-complete               5d4953aabf2a   2 years ago     11.2GB    0B            11.24GB       1

Also, according to this document, we have only 14 GB of disk space on GitHub-hosted runners.
https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#standard-github-hosted-runners-for-public-repositories

takagi (Member) commented May 8, 2024

I'm not sure how we should run the build-rocm job on GitHub-hosted runners; however, I'm going to temporarily replicate and update #7964 to test that the changes in this PR work on the self-hosted runners.

takagi (Member) commented May 8, 2024

Also, I will update #7964 itself to make it ready to merge, which may require our members to confirm our policy on having self-hosted runners on a public repository.

kmaehashi self-assigned this on May 8, 2024
kmaehashi (Member):

Thank you for the contribution and all the help improving support for ROCm in CuPy! Regarding this pull request, I'm a bit concerned about the potential instability this may introduce and the negative impact it could have on both user and developer experience:

  • This change would require users to install hipify-torch in order to build CuPy for ROCm, which adds an extra step and a potential barrier to the installation process.
  • All contributors modifying CUDA code in the CuPy repository would need to be familiar with hipify-torch. Given that hipify-torch uses a search-and-replace approach, it seems highly likely to run into failures in edge cases.
  • Hipified files are left in the source tree. They are recognized as untracked files by Git and could be noisy or accidentally committed (as shown below).
  • Since hipify-torch is not versioned, the build results of CuPy could vary uncontrollably based on the latest commit in the hipify-torch repository (https://github.com/ROCm/hipify_torch).

CuPy's ROCm support is already mature, and in my current understanding, the introduction of hipify-torch does not appear to provide significant value compared to the drawbacks mentioned above.

% pip install -e . -v
Using pip 24.0 from /home/maehashi/.pyenv/versions/3.10.13/envs/local-3.10.13/lib/python3.10/site-packages/pip (python 3.10)
Obtaining file:///home/maehashi/Development/cupy
  Running command python setup.py egg_info
  Generating cache key from header files...
  Cache key (1700 files matching /home/maehashi/Development/cupy/cupy/_core/include/**): 65dbfe98a23933f1a73cec9264029f473436ff11
  INFO: hipification of cupy_backends in progress ...
  /home/maehashi/Development/cupy/cupy_backends/cuda/cupy_cuda.h -> /home/maehashi/Development/cupy/cupy_backends/cuda/cupy_hip.h [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/cupy_nccl.h -> /home/maehashi/Development/cupy/cupy_backends/cuda/cupy_nccl_hip.h [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/stream.pxd -> /home/maehashi/Development/cupy/cupy_backends/cuda/stream.pxd [skipped, no changes]
  /home/maehashi/Development/cupy/cupy_backends/cuda/cupy_cusparse.h -> /home/maehashi/Development/cupy/cupy_backends/cuda/cupy_cusparse_hip.h [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/cupy_cublas.h -> /home/maehashi/Development/cupy/cupy_backends/cuda/cupy_cublas_hip.h [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/cupy_cuda_profiler_api.h -> /home/maehashi/Development/cupy/cupy_backends/cuda/cupy_hip_profiler_api.h [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/stream.cpp -> /home/maehashi/Development/cupy/cupy_backends/cuda/stream_hip.cpp [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/cupy_cutensor.h -> /home/maehashi/Development/cupy/cupy_backends/cuda/cupy_cutensor_hip.h [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/_softlink.pxd -> /home/maehashi/Development/cupy/cupy_backends/cuda/_softlink.pxd [skipped, no changes]
  /home/maehashi/Development/cupy/cupy_backends/cuda/stream.pyx -> /home/maehashi/Development/cupy/cupy_backends/cuda/stream.pyx [skipped, no changes]
  /home/maehashi/Development/cupy/cupy_backends/cuda/__init__.pxd -> /home/maehashi/Development/cupy/cupy_backends/cuda/__init__.pxd [skipped, no changes]
  /home/maehashi/Development/cupy/cupy_backends/cuda/cupy_cusolver.h -> /home/maehashi/Development/cupy/cupy_backends/cuda/cupy_cusolver_hip.h [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/_softlink.cpp -> /home/maehashi/Development/cupy/cupy_backends/cuda/_softlink_hip.cpp [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/_softlink.pyx -> /home/maehashi/Development/cupy/cupy_backends/cuda/_softlink.pyx [skipped, no changes]
  /home/maehashi/Development/cupy/cupy_backends/cuda/cupy_cudnn.h -> /home/maehashi/Development/cupy/cupy_backends/cuda/cupy_cudnn.h [skipped, no changes]
  /home/maehashi/Development/cupy/cupy_backends/cuda/cupy_cuda_runtime.h -> /home/maehashi/Development/cupy/cupy_backends/cuda/cupy_hip_runtime.h [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/api/runtime.cpp -> /home/maehashi/Development/cupy/cupy_backends/cuda/api/runtime_hip.cpp [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/api/_runtime_enum.pxd -> /home/maehashi/Development/cupy/cupy_backends/cuda/api/_runtime_enum_hip.pxd [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/api/_driver_enum.cpp -> /home/maehashi/Development/cupy/cupy_backends/cuda/api/_driver_enum_hip.cpp [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/api/_driver_extern.pxi -> /home/maehashi/Development/cupy/cupy_backends/cuda/api/_driver_extern_hip.pxi [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/api/_runtime_typedef.pxi -> /home/maehashi/Development/cupy/cupy_backends/cuda/api/_runtime_typedef_hip.pxi [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/api/driver.cpp -> /home/maehashi/Development/cupy/cupy_backends/cuda/api/driver_hip.cpp [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/api/_driver_enum.pxd -> /home/maehashi/Development/cupy/cupy_backends/cuda/api/_driver_enum_hip.pxd [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/api/_runtime_enum.cpp -> /home/maehashi/Development/cupy/cupy_backends/cuda/api/_runtime_enum_hip.cpp [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/api/runtime.pyx -> /home/maehashi/Development/cupy/cupy_backends/cuda/api/runtime_hip.pyx [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/api/_driver_enum.pyx -> /home/maehashi/Development/cupy/cupy_backends/cuda/api/_driver_enum.pyx [skipped, no changes]
  /home/maehashi/Development/cupy/cupy_backends/cuda/api/_runtime_softlink.pxi -> /home/maehashi/Development/cupy/cupy_backends/cuda/api/_runtime_softlink.pxi [skipped, no changes]
  /home/maehashi/Development/cupy/cupy_backends/cuda/api/driver.pyx -> /home/maehashi/Development/cupy/cupy_backends/cuda/api/driver_hip.pyx [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/api/_runtime_enum.pyx -> /home/maehashi/Development/cupy/cupy_backends/cuda/api/_runtime_enum.pyx [skipped, no changes]
  /home/maehashi/Development/cupy/cupy_backends/cuda/api/runtime.pxd -> /home/maehashi/Development/cupy/cupy_backends/cuda/api/runtime_hip.pxd [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/api/__init__.pxd -> /home/maehashi/Development/cupy/cupy_backends/cuda/api/__init__.pxd [skipped, no changes]
  /home/maehashi/Development/cupy/cupy_backends/cuda/api/_driver_typedef.pxi -> /home/maehashi/Development/cupy/cupy_backends/cuda/api/_driver_typedef_hip.pxi [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/api/_runtime_extern.pxi -> /home/maehashi/Development/cupy/cupy_backends/cuda/api/_runtime_extern_hip.pxi [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/api/driver.pxd -> /home/maehashi/Development/cupy/cupy_backends/cuda/api/driver_hip.pxd [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/nccl.pyx -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/nccl.pyx [skipped, no changes]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/nvrtc.pyx -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/nvrtc_hip.pyx [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cutensor.cpp -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cutensor_hip.cpp [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cusparse.pxd -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cusparse_hip.pxd [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/profiler.cpp -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/profiler_hip.cpp [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cusolver.pyx -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cusolver_hip.pyx [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/curand.pyx -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/curand_hip.pyx [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cusolver.pxd -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cusolver_hip.pxd [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/nvrtc.cpp -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/nvrtc_hip.cpp [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/nccl.pxd -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/nccl.pxd [skipped, no changes]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/curand.cpp -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/curand_hip.cpp [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cublas.pxd -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cublas_hip.pxd [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cusolver.cpp -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cusolver_hip.cpp [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cutensor.pxd -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cutensor.pxd [skipped, no changes]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cutensor.pyx -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cutensor_hip.pyx [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/nvtx.cpp -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/nvtx_hip.cpp [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/nccl.cpp -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/nccl_hip.cpp [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cublas.cpp -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cublas_hip.cpp [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/__init__.pxd -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/__init__.pxd [skipped, no changes]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/nvrtc.pxd -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/nvrtc_hip.pxd [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cublas.pyx -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cublas_hip.pyx [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cudnn.pyx -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cudnn_hip.pyx [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cusparse.pyx -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cusparse_hip.pyx [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cusparselt.pxd -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cusparselt.pxd [skipped, no changes]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cusparse.cpp -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cusparse_hip.cpp [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/_cnvrtc.pxi -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/_cnvrtc_hip.pxi [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cusparselt.pyx -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cusparselt_hip.pyx [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/nvtx.pyx -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/nvtx_hip.pyx [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cudnn.pxd -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/cudnn_hip.pxd [ok]
  /home/maehashi/Development/cupy/cupy_backends/cuda/libs/curand.pxd -> /home/maehashi/Development/cupy/cupy_backends/cuda/libs/curand_hip.pxd [ok]
  Successfully preprocessed all matching files.
...

AdrianAbeyta (Contributor Author):

(Quoting @takagi's comment above about the "no space left on device" Docker error and the growing size of the rocm-6.1 image.)

Thanks for pointing this out. As ROCm versions have evolved, the base ROCm images have grown in size. To account for this, I have added a cleanup step that removes unnecessary items taking up runner disk space. See the modification here.

AdrianAbeyta (Contributor Author):

(Quoting @kmaehashi's comment above, including the list of concerns and the hipification log from pip install -e . -v.)

Thank you for your insights, @kmaehashi.

While hipify-torch adds an extra installation step, it aligns us with industry practice and uses PyTorch's proven method for ROCm users. Our current ROCm support is mature; however, it does not fully utilize newer ROCm capabilities. Adopting hipify-torch keeps CUDA and ROCm aligned, ensures functional parity, and minimizes changes on the CUDA side. We hope it will streamline the development process and prevent ROCm support from falling behind CUDA.

Thanks for the feedback on the user/dev experience. After some thought, I believe we could improve it with the following:

  • We want more open-source community engagement and welcome any suggestions to streamline the development process. For edge cases or any questions that arise, please involve the CuPy ROCm developers to help address issues (mentioning @pnunna93, @lcskrishna, @bmedishe, and @shbiswas834 here). I can also collaborate with the hipify-torch team to provide a more comprehensive README.

  • To address the issue of hipified files in the source tree, we can update our .gitignore configuration. This should keep the repository organized, reduce noise, and prevent accidental commits of these files. I have proposed a solution here.

  • I discussed the stability concern around hipify-torch with the PyTorch team, and they mentioned that they currently ensure stability by pinning to specific commits. We can adopt this approach to prevent build variations and maintain consistency; an example can be found here. Additionally, I can collaborate with the hipify-torch team to provide more comprehensive versioning support in the future that we can track.

I appreciate the feedback and am open to other suggestions on how to make the user/dev experience more positive with this change.

AdrianAbeyta (Contributor Author):

Also I will update #7964 itself to make it ready to merge, which may require our members to confirm our policy to have self-hosted runners on a public repository.

Thanks, @takagi. Please let me know if there's anything I can do to help. I understand that some teams using self-hosted runners on public repositories have a setup where a maintainer needs to approve a pull request before CI workflows can run. This is done using the "Approve and Run" button for added security; see approving-workflow-runs-from-public-forks for more information.

AdrianAbeyta requested a review from takagi on May 17, 2024 at 17:27
takagi (Member) commented May 20, 2024

Thanks, @AdrianAbeyta. Our team discussed the "Approve and Run" button, and we would prefer not to use it because, even for workflows using a GitHub-hosted runner, e.g. for linting and document building, it requires manual approval on every commit. We tried it before and found it too cumbersome.

Instead, we are considering another solution. We plan to have two repositories, A (public, used for development) and A-fork (private, used for testing, with a self-hosted runner registered). A test comment on a pull request in A will trigger pushing the commit from A to A-fork and then start the CI workflow on A-fork.

Labels: cat:enhancement (Improvements to existing features), prio:medium
Projects: Status: 🏭 In progress
3 participants