Errors with Installation #431

Open
wehos opened this issue Mar 27, 2024 · 2 comments

@wehos
Collaborator

wehos commented Mar 27, 2024

pip install torch==${PYTORCH_VERSION} torchvision=${TORCHVISION_VERSION} ${PYTORCH_CHANNEL}

torchvision= should be replaced with torchvision==; otherwise install.sh raises an error.
Fixed by the latest PR.

@wehos
Collaborator Author

wehos commented Mar 27, 2024

The dgl installation does not work on HPCC systems. After installation, a dependency error is raised on import:

from dance.datasets.multimodality import ModalityMatchingDataset
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/dance/datasets/__init__.py", line 1, in <module>
    from dance.datasets.multimodality import JointEmbeddingNIPSDataset, ModalityMatchingDataset, ModalityPredictionDataset
  File "/dance/datasets/multimodality.py", line 14, in <module>
    from dance.datasets.base import BaseDataset
  File "/dance/datasets/base.py", line 9, in <module>
    from dance.transforms.base import BaseTransform
  File "/dance/transforms/__init__.py", line 1, in <module>
    from dance.transforms import graph
  File "/dance/transforms/graph/__init__.py", line 1, in <module>
    from dance.transforms.graph.cell_feature_graph import CellFeatureBipartiteGraph, CellFeatureGraph, PCACellFeatureGraph
  File "/dance/transforms/graph/cell_feature_graph.py", line 1, in <module>
    import dgl
  File "/dance/lib/python3.11/site-packages/dgl/__init__.py", line 14, in <module>
    from .backend import backend_name, load_backend  # usort: skip
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/dance/lib/python3.11/site-packages/dgl/backend/__init__.py", line 122, in <module>
    load_backend(get_preferred_backend())
  File "/dance/lib/python3.11/site-packages/dgl/backend/__init__.py", line 51, in load_backend
    from .._ffi.base import load_tensor_adapter  # imports DGL C library
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/dance/lib/python3.11/site-packages/dgl/_ffi/base.py", line 50, in <module>
    _LIB, _LIB_NAME, _DIR_NAME = _load_lib()
                                 ^^^^^^^^^^^
  File "/dance/lib/python3.11/site-packages/dgl/_ffi/base.py", line 39, in _load_lib
    lib = ctypes.CDLL(lib_path[0])
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "//dance/lib/python3.11/ctypes/__init__.py", line 376, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: libcusparse.so.11: cannot open shared object file: No such file or directory

This can be reproduced simply by running from dance.datasets.multimodality import ModalityMatchingDataset.
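
To check whether the failure comes from the dynamic loader itself rather than from dance or dgl, the missing library can be probed directly. The following is a minimal sketch, assuming Python 3 on Linux; the helper name can_load is hypothetical and is not part of dance or dgl.

```python
import ctypes


def can_load(lib_name: str) -> bool:
    """Return True if the dynamic loader can open lib_name, else False."""
    try:
        # Same call that fails inside dgl/_ffi/base.py in the traceback.
        ctypes.CDLL(lib_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # libcusparse.so.11 is the library named in the OSError above.
    name = "libcusparse.so.11"
    print(name, "loadable:", can_load(name))
```

If this reports that the library is not loadable in the same environment, the import error is likely a library search path problem rather than a bug in the package itself.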

@wehos
Collaborator Author

wehos commented Mar 27, 2024

export LD_LIBRARY_PATH=~/anaconda3/lib (replacing the path with the lib directory that contains the required shared libraries) solves the issue above.

We need to remind HPCC users about this. HPC users may need to install cudatoolkit via conda (conda install cudatoolkit=11.8 -c pytorch) and set the LD_LIBRARY_PATH environment variable.
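
As a possible aid for that reminder, a small script could tell users whether any directory on their library search path actually contains the missing file. This is only a sketch under stated assumptions: the function name find_in_ld_library_path is hypothetical, and it inspects only the colon-separated LD_LIBRARY_PATH variable, not the other locations the Linux loader searches (such as the ldconfig cache).

```python
import os
from pathlib import Path


def find_in_ld_library_path(lib_name, env=None):
    """Return the LD_LIBRARY_PATH directories that contain lib_name."""
    env = os.environ if env is None else env
    # LD_LIBRARY_PATH is a colon-separated list of directories on Linux.
    entries = env.get("LD_LIBRARY_PATH", "").split(":")
    hits = []
    for entry in entries:
        if not entry:
            continue  # skip empty entries from a leading/trailing colon
        directory = Path(entry)
        if (directory / lib_name).exists():
            hits.append(str(directory))
    return hits


if __name__ == "__main__":
    found = find_in_ld_library_path("libcusparse.so.11")
    if found:
        print("libcusparse.so.11 found in:", ", ".join(found))
    else:
        print("libcusparse.so.11 not found in any LD_LIBRARY_PATH directory")
```

Note that the variable should be exported in the shell before Python starts; changing it from inside an already running interpreter generally does not affect how the loader resolves libraries.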
