Cannot access CUDA GPU on WSL #5462

benchd opened this issue May 11, 2024 · 1 comment

benchd commented May 11, 2024

Version

nvidia-dali-cuda120:1.37.1, nvidia-dali-nightly-cuda120 1.38.0.dev20240507

Describe the bug.

I've been following #4663 and I'm seeing something similar, but I cannot figure out why. I can access my GPU as device 0 with nvidia-smi, and PyTorch can access it from the same conda environment, so I'm unclear why DALI cannot. This is inside a conda environment in WSL on Windows.

Minimum reproducible example

Conda environment:
name: multilabelimage_model_env
channels:
  - pytorch
  - nvidia
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - pytorch
  - torchvision
  - torchaudio
  - pytorch-cuda=12.1
  - opencv
  - pandas
  - scikit-learn=1.4.0
  - wandb
  - matplotlib
  - tqdm
  - pillow
  - numpy
  - scipy
  - pyyaml
  - pip
  - pip:
      - torch-summary
      - tensorboard
      - torch-tb-profiler
      - torch-geometric
      - timm

I installed DALI following the official installation guide:
pip install --extra-index-url https://pypi.nvidia.com --upgrade nvidia-dali-cuda120

I also tried the nightly build.

Tested with minimal example:
```python
import nvidia.dali as dali
import numpy as np
@dali.pipeline_def
def my_pipe():
  return dali.fn.external_source(np.array([1,2,3], dtype=np.float32), batch=False).gpu()

pipe = my_pipe(batch_size=1, num_threads=1, device_id=1)
pipe.build()
print(pipe.run())
```

Relevant log output

The minimal example above fails with this error:

python dali_test.py
/root/miniconda3/envs/multilabelimage_model_env/lib/python3.11/site-packages/nvidia/dali/backend.py:99: Warning: nvidia-dali-cuda120 is no longer shipped with CUDA runtime. You need to install it separately. cuFFT is typically provided with CUDA Toolkit installation or an appropriate wheel. Please check https://docs.nvidia.com/cuda/cuda-quick-start-guide/index.html#pip-wheels-installation-linux for the reference.
  deprecation_warning(
/root/miniconda3/envs/multilabelimage_model_env/lib/python3.11/site-packages/nvidia/dali/backend.py:110: Warning: nvidia-dali-cuda120 is no longer shipped with CUDA runtime. You need to install it separately. NPP is typically provided with CUDA Toolkit installation or an appropriate wheel. Please check https://docs.nvidia.com/cuda/cuda-quick-start-guide/index.html#pip-wheels-installation-linux for the reference.
  deprecation_warning(
/root/miniconda3/envs/multilabelimage_model_env/lib/python3.11/site-packages/nvidia/dali/backend.py:121: Warning: nvidia-dali-cuda120 is no longer shipped with CUDA runtime. You need to install it separately. nvJPEG is typically provided with CUDA Toolkit installation or an appropriate wheel. Please check https://docs.nvidia.com/cuda/cuda-quick-start-guide/index.html#pip-wheels-installation-linux for the reference.
  deprecation_warning(
Traceback (most recent call last):
  File "/mnt/c/Coding/Testing/PyTorch/MultiLabelClassification_Patreon/actual_real_user_code/dali_test.py", line 8, in <module>
    pipe.build()
  File "/root/miniconda3/envs/multilabelimage_model_env/lib/python3.11/site-packages/nvidia/dali/pipeline.py", line 979, in build
    self._init_pipeline_backend()
  File "/root/miniconda3/envs/multilabelimage_model_env/lib/python3.11/site-packages/nvidia/dali/pipeline.py", line 813, in _init_pipeline_backend
    self._pipe = b.Pipeline(
                 ^^^^^^^^^^^
RuntimeError: CUDA runtime API error cudaErrorInvalidDevice (101):
invalid device ordinal

Other/Misc.

I found similar issues but could not find a solution.

Check for duplicates

  • I have searched the open bugs/issues and have found no duplicates for this bug report
benchd added the bug label May 11, 2024
mzient (Contributor) commented May 13, 2024

Hello @benchd,
Please check your device id. You said you can access "device 0", but your DALI snippet specifies device 1.

pipe = my_pipe(batch_size=1, num_threads=1, device_id=1)
                                            ^^^^^^^^^^^
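
To make this kind of mismatch easier to catch before `pipe.build()`, the device ordinal can be checked against the number of devices the process actually sees. The following is only a minimal sketch: `pick_device_id` and `visible_device_count` are hypothetical helper names, not part of DALI, and the count is taken from PyTorch's `torch.cuda.device_count()` on the assumption that PyTorch is installed in the same environment, as it is in this report.

```python
def pick_device_id(requested: int, device_count: int) -> int:
    """Return `requested` if it is a valid CUDA ordinal, otherwise raise.

    Hypothetical helper for illustration; not part of DALI.
    """
    if device_count <= 0:
        raise RuntimeError("no CUDA devices are visible to this process")
    if not 0 <= requested < device_count:
        raise ValueError(
            f"device_id={requested} is out of range; "
            f"valid ordinals are 0..{device_count - 1}"
        )
    return requested


def visible_device_count() -> int:
    """Count visible CUDA devices via PyTorch; returns 0 if PyTorch is missing."""
    try:
        import torch
    except ImportError:
        return 0
    return torch.cuda.device_count()


# Usage with the snippet from this issue (kept as comments so the helpers
# above run without DALI or a GPU):
# device_id = pick_device_id(0, visible_device_count())
# pipe = my_pipe(batch_size=1, num_threads=1, device_id=device_id)
```

On a machine with a single GPU, `pick_device_id(1, 1)` raises a `ValueError` naming the valid range, which is easier to read than the `cudaErrorInvalidDevice` raised from inside `pipe.build()`.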
