Different behavior between CPU and CUDA #163

Open
Boltzmachine opened this issue Aug 19, 2021 · 1 comment
Labels
bug Something isn't working

Comments

@Boltzmachine

Boltzmachine commented Aug 19, 2021

When the CUDA devices are not visible, my program runs fine.
But when the CUDA devices are available, it fails with this error:

Traceback (most recent call last):
  File "/home/boltzmachine/THG/train.py", line 10, in <module>
    from models.simple import Simple, Share
  File "/home/boltzmachine/THG/models/simple.py", line 5, in <module>
    from .GNN import DenseGatedRGCN
  File "/home/boltzmachine/THG/models/GNN.py", line 10, in <module>
    from torch_geometric.nn.inits import glorot, zeros
  File "/home/boltzmachine/miniconda3/envs/THG/lib/python3.7/site-packages/torch_geometric/__init__.py", line 5, in <module>
    import torch_geometric.data
  File "/home/boltzmachine/miniconda3/envs/THG/lib/python3.7/site-packages/torch_geometric/data/__init__.py", line 1, in <module>
    from .data import Data
  File "/home/boltzmachine/miniconda3/envs/THG/lib/python3.7/site-packages/torch_geometric/data/data.py", line 8, in <module>
    from torch_sparse import coalesce, SparseTensor
  File "/home/boltzmachine/miniconda3/envs/THG/lib/python3.7/site-packages/torch_sparse/__init__.py", line 15, in <module>
    f'{library}_{suffix}', [osp.dirname(__file__)]).origin)
AttributeError: 'NoneType' object has no attribute 'origin'
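
The last frame suggests that the package looks up a file named `{library}_{suffix}` next to its `__init__.py` and then reads `.origin` from the result. The following is a minimal sketch of that failure mode, assuming the lookup goes through `importlib.machinery.PathFinder.find_spec` (the traceback shows only the tail of the call, so this is an assumption) and that `suffix` becomes `cuda` once CUDA is available. `locate_extension` is a hypothetical helper, not part of torch-sparse.

```python
import importlib.machinery
import tempfile


def locate_extension(library, suffix, package_dir):
    """Return the path of `{library}_{suffix}` in package_dir, or raise a clear error."""
    name = f"{library}_{suffix}"
    # find_spec returns None (it does not raise) when no matching file exists.
    spec = importlib.machinery.PathFinder().find_spec(name, [package_dir])
    if spec is None:
        raise FileNotFoundError(
            f"no compiled library named {name!r} in {package_dir}"
        )
    return spec.origin


# Demonstration on an empty directory: the lookup comes back empty, as it
# would for a package that ships only *_cpu.so files while CUDA is available.
with tempfile.TemporaryDirectory() as empty_dir:
    spec = importlib.machinery.PathFinder().find_spec("_version_cuda", [empty_dir])
    print(spec)  # None, so `spec.origin` would raise AttributeError
```

If this assumption holds, the `AttributeError` only says that the `_cuda` variant of a library could not be found; it does not by itself indicate a problem with the GPU or the driver.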

I use torch 1.7.0 with cu101.
I uninstalled torch-sparse repeatedly until nothing was left installed,
and then installed everything with:

pip install torch-sparse -f https://pytorch-geometric.com/whl/torch-1.7.0+cu101.html --no-cache-dir
pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-1.7.0+cu101.html --no-cache-dir
pip install torch-cluster -f https://pytorch-geometric.com/whl/torch-1.7.0+cu101.html --no-cache-dir
pip install torch-spline-conv -f https://pytorch-geometric.com/whl/torch-1.7.0+cu101.html --no-cache-dir
pip --no-cache-dir install torch-geometric

It seems that there are no CUDA *.so files in ~/miniconda3/envs/THG/lib/python3.7/site-packages/torch/cuda/

__init__.py      _diag_cpu.so        _metis_cpu.so    _saint_cpu.so   _spspmm_cpu.so   bandwidth.py  convert.py  index_select.py   metis.py   padding.py  rw.py      select.py  storage.py    utils.py
__pycache__      _ego_sample_cpu.so  _relabel_cpu.so  _sample_cpu.so  _version_cpu.so  cat.py        diag.py     masked_select.py  mul.py     permute.py  saint.py   spmm.py    tensor.py
_convert_cpu.so  _hgt_sample_cpu.so  _rw_cpu.so       _spmm_cpu.so    add.py           coalesce.py   eye.py      matmul.py         narrow.py  reduce.py   sample.py  spspmm.py  transpose.py
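
A listing like the one above can be summarized programmatically. This is a small sketch, assuming the compiled libraries follow the `*_cpu.so` / `*_cuda.so` naming seen in the listing; `extension_variants` is a hypothetical helper written for illustration.

```python
from pathlib import Path


def extension_variants(package_dir):
    """Group the compiled *.so files in package_dir by their _cpu/_cuda suffix."""
    variants = {"cpu": [], "cuda": []}
    for path in sorted(Path(package_dir).glob("*.so")):
        for suffix in variants:
            # path.stem drops the ".so" extension, e.g. "_diag_cpu".
            if path.stem.endswith(f"_{suffix}"):
                variants[suffix].append(path.name)
    return variants


# Possible usage (not executed here). importlib.util.find_spec locates a
# top-level package without running its __init__.py, so it still works when
# the import itself fails:
#
#   import importlib.util
#   spec = importlib.util.find_spec("torch_sparse")
#   print(extension_variants(spec.submodule_search_locations[0]))
```

An empty `cuda` list together with a populated `cpu` list would match the situation described here: a CPU-only build was installed.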

One possible reason is that I am working on a compute cluster. In my local environment, no CUDA device is available until I submit a job through Slurm.

@rusty1s
Owner

rusty1s commented Aug 20, 2021

The *_cuda.so files should be available nonetheless, even if there is no GPU available until a job is submitted. I therefore think that the issue is that installing from wheels fails for you. How long does the installation take? Can you try again with:

pip install --no-index torch-scatter -f https://pytorch-geometric.com/whl/torch-1.7.0+cu101.html
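
When retrying, it may help to confirm that the `-f` URL matches the installed torch build. The sketch below assumes the wheel index follows the pattern seen in the commands in this thread (`torch-<version>+<cuda tag>.html`); `wheel_index_url` is a hypothetical helper, and the `cpu` tag for CPU-only builds is likewise an assumption.

```python
def wheel_index_url(torch_version, cuda_version):
    """Build the wheel index URL for a given torch version and CUDA version.

    torch_version: e.g. "1.7.0" or "1.7.0+cu101" (any local "+..." part is dropped)
    cuda_version:  e.g. "10.1", or None for a CPU-only torch build
    """
    base_version = torch_version.split("+")[0]
    # "10.1" -> "cu101"; the "cpu" tag for CPU-only builds is an assumption.
    tag = "cpu" if cuda_version is None else "cu" + cuda_version.replace(".", "")
    return f"https://pytorch-geometric.com/whl/torch-{base_version}+{tag}.html"


# Possible usage with the installed torch (not executed here):
#
#   import torch
#   print(wheel_index_url(torch.__version__, torch.version.cuda))
```

If the printed URL differs from the one passed to `pip install -f`, the installed torch build and the requested wheels do not match.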

@rusty1s rusty1s added the bug Something isn't working label Sep 16, 2021