HuggingFace Interface works on Colab but not on local jupyter setup #55

Closed
atul-ai opened this issue Mar 24, 2024 · 2 comments

Comments


atul-ai commented Mar 24, 2024

Hi, I am on Ubuntu 23.10 (Mantic) with a couple of GPUs, a 3090 being the biggest one. I want to run a batch of Marathi-to-English translations with IndicTrans2. When I try to run the huggingface_interface on my local machine in JupyterLab, it fails with the following error:

```
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[1], line 2
      1 import torch
----> 2 from transformers import AutoModelForSeq2SeqLM, BitsAndBytesConfig
      3 from IndicTransTokenizer import IndicProcessor, IndicTransTokenizer
      5 BATCH_SIZE = 4

File ~/.local/lib/python3.11/site-packages/transformers/__init__.py:26
     23 from typing import TYPE_CHECKING
     25 # Check the dependencies satisfy the minimal versions required.
---> 26 from . import dependency_versions_check
     27 from .utils import (
     28     OptionalDependencyNotAvailable,
     29     _LazyModule,
   (...)
     47     logging,
     48 )
     51 logger = logging.get_logger(__name__)  # pylint: disable=invalid-name

File ~/.local/lib/python3.11/site-packages/transformers/dependency_versions_check.py:16
      1 # Copyright 2020 The HuggingFace Team. All rights reserved.
      2 #
      3 # Licensed under the Apache License, Version 2.0 (the "License");
   (...)
     12 # See the License for the specific language governing permissions and
     13 # limitations under the License.
     15 from .dependency_versions_table import deps
---> 16 from .utils.versions import require_version, require_version_core
     19 # define which module versions we always want to check at run time
     20 # (usually the ones defined in `install_requires` in setup.py)
     21 #
     22 # order specific notes:
     23 # - tqdm must be checked before tokenizers
     25 pkgs_to_check_at_runtime = [
     26     "python",
     27     "tqdm",
   (...)
     37     "pyyaml",
     38 ]

File ~/.local/lib/python3.11/site-packages/transformers/utils/__init__.py:33
     24 from .constants import IMAGENET_DEFAULT_MEAN, IMAGENET_DEFAULT_STD, IMAGENET_STANDARD_MEAN, IMAGENET_STANDARD_STD
     25 from .doc import (
     26     add_code_sample_docstrings,
     27     add_end_docstrings,
   (...)
     31     replace_return_docstrings,
     32 )
---> 33 from .generic import (
     34     ContextManagers,
     35     ExplicitEnum,
     36     ModelOutput,
     37     PaddingStrategy,
     38     TensorType,
     39     add_model_info_to_auto_map,
     40     cached_property,
     41     can_return_loss,
     42     expand_dims,
     43     find_labels,
     44     flatten_dict,
     45     infer_framework,
     46     is_jax_tensor,
     47     is_numpy_array,
     48     is_tensor,
     49     is_tf_symbolic_tensor,
     50     is_tf_tensor,
     51     is_torch_device,
     52     is_torch_dtype,
     53     is_torch_tensor,
     54     reshape,
     55     squeeze,
     56     strtobool,
     57     tensor_size,
     58     to_numpy,
     59     to_py_obj,
     60     transpose,
     61     working_or_temp_dir,
     62 )
     63 from .hub import (
     64     CLOUDFRONT_DISTRIB_PREFIX,
     65     HF_MODULES_CACHE,
   (...)
     91     try_to_load_from_cache,
     92 )
     93 from .import_utils import (
     94     ACCELERATE_MIN_VERSION,
     95     ENV_VARS_TRUE_AND_AUTO_VALUES,
   (...)
    200     torch_only_method,
    201 )

File ~/.local/lib/python3.11/site-packages/transformers/utils/generic.py:442
    438         return tuple(self[k] for k in self.keys())
    441 if is_torch_available():
--> 442     import torch.utils._pytree as _torch_pytree
    444     def _model_output_flatten(output: ModelOutput) -> Tuple[List[Any], "_torch_pytree.Context"]:
    445         return list(output.values()), list(output.keys())

File ~/.local/lib/python3.11/site-packages/torch/utils/__init__.py:4
      1 import os.path as _osp
      2 import torch
----> 4 from .throughput_benchmark import ThroughputBenchmark
      5 from .cpp_backtrace import get_cpp_backtrace
      6 from .backend_registration import rename_privateuse1_backend, generate_methods_for_privateuse1_backend

File ~/.local/lib/python3.11/site-packages/torch/utils/throughput_benchmark.py:2
----> 2 import torch._C
      5 def format_time(time_us=None, time_ms=None, time_s=None):
      6     """Define time formatting."""

ModuleNotFoundError: No module named 'torch._C'
```
The same code works on Google Colab. I have around 7K articles to translate, but Colab does not stay up long enough for that, and I would like to avoid paying for it.
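A `torch._C` import failure generally means the interpreter found a `torch` package directory but not its compiled extension. As a rough diagnostic sketch (the helper name `locate_package` is hypothetical, not part of torch or transformers), the following reports where a package would be imported from and whether a compiled `_C` extension file sits next to it, without importing the package itself:

```python
import importlib.machinery
import importlib.util
import sys
from pathlib import Path
from typing import Optional


def locate_package(name: str) -> Optional[dict]:
    """Report where a top-level package would be imported from, without importing it."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    locations = list(spec.submodule_search_locations or [])
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
    # A compiled extension such as torch._C is a file named _C<suffix>
    # (for example _C.cpython-311-x86_64-linux-gnu.so) inside the package directory.
    compiled_ext = [
        str(path)
        for location in locations
        for path in Path(location).glob("_C*")
        if path.is_file() and path.name.endswith(suffixes)
    ]
    return {"origin": spec.origin, "locations": locations, "compiled_ext": compiled_ext}


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("torch:", locate_package("torch"))
```

Run in a notebook cell, an empty `compiled_ext` list or an `origin` of `None` for `torch` would point to an incomplete or shadowed install in whichever directory `locations` shows.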

Collaborator

prajdabre commented Mar 24, 2024

Hi

I am assuming that you downloaded the Colab notebook as is and tried to run it locally in JupyterLab, and that it didn't work. Please correct me if I am wrong.

Regardless,

  1. Can you share your pip freeze and/or conda list?
  2. Also can you share how you set up your local environment?
  3. Did you use conda or virtualenv?
  4. Did you install the necessary libraries via the notebook itself (!pip install xyz) or explicitly on the command line? I would not expect the former to work every time. Colab is a bit special in that it handles installations well; whenever I have used Jupyter to install things from a notebook, it has sometimes not done so properly.
  5. Did you ensure that the environments in Colab and on your local machine are mostly the same? Mismatched underlying libraries often lead to issues. For example, if you don't have the correct GCC version, things don't work. Having the right CUDA version also matters; otherwise torch does not get installed properly.
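For comparing the Colab and local environments, a small report run in both places can make differences visible. This is only a sketch: the function name `environment_report` and the default package list are illustrative choices, and it reads installed metadata rather than importing the packages, so it still works when `import torch` itself fails.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("torch", "transformers", "tokenizers", "accelerate", "bitsandbytes")):
    """Collect interpreter and package versions without importing the packages."""
    report = {
        "python": platform.python_version(),
        "executable": sys.executable,
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key:>14}: {value}")
```

Once `import torch` succeeds, `torch.version.cuda` and `torch.cuda.is_available()` cover the CUDA side of the comparison.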

What would help even more is a snapshot of your local notebook with the output of each cell up to the point where things failed.

I did a quick Google search on your error, and the following issues seem to have extensive discussions:

  1. ModuleNotFoundError: No module named 'torch._C' pytorch/pytorch#574
  2. ModuleNotFoundError: No module named 'torch' pytorch/pytorch#4827 (most reliable since it talks about issues with notebooks which work well via conda install rather than pip install)

Overall, it's an environment setup issue and should be solved relatively easily. Answers to the questions above will help us isolate it.
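On the notebook-versus-command-line question: `!pip install ...` in a cell runs whichever `pip` is first on the shell's PATH, which is not necessarily the one belonging to the kernel's interpreter. One way to avoid that mismatch, sketched below with hypothetical helper names, is to invoke pip through `sys.executable`; the `%pip install ...` magic in IPython serves the same purpose.

```python
import subprocess
import sys


def pip_install_command(packages):
    """Build a pip command bound to the interpreter running this kernel."""
    return [sys.executable, "-m", "pip", "install", *packages]


def install(packages, dry_run=True):
    """Return the command; actually run it only when dry_run is False."""
    command = pip_install_command(packages)
    if not dry_run:
        subprocess.check_call(command)
    return command


if __name__ == "__main__":
    # Dry run: only show what would be executed.
    print(" ".join(install(["torch", "transformers"])))
```

The dry-run default is a deliberate choice here so the cell can be inspected before anything is installed.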

Author

atul-ai commented Mar 24, 2024

  1. Here is my pip freeze:
    absl-py==1.4.0 accelerate==0.28.0 aiofiles==23.2.1 aiohttp==3.9.3 aiosignal==1.3.1 alabaster==0.7.16 altair==5.2.0 annotated-types==0.6.0 antlr4-python3-runtime==4.8 anyio==4.3.0 apt-xapian-index==0.49 argon2-cffi==23.1.0 argon2-cffi-bindings==21.2.0 arrow==1.3.0 asttokens==2.4.1 astunparse==1.6.3 async-lru==2.0.4 attrs==23.2.0 Babel==2.14.0 bcrypt==3.2.2 beautifulsoup4==4.12.3 bitarray==2.9.2 bitsandbytes==0.43.0 bleach==6.1.0 blinker==1.6.2 Brlapi==0.8.4 Brotli==1.0.9 certifi==2022.9.24 cffi==1.16.0 chardet==5.1.0 click==7.1.2 colorama==0.4.6 comm==0.2.1 command-not-found==0.3 contourpy==1.2.0 cryptography==38.0.4 ctranslate2==3.9.0 cupshelpers==1.0 cycler==0.12.1 Cython==3.0.9 datasets==2.18.0 dbus-python==1.3.2 debugpy==1.8.1 decorator==5.1.1 defer==1.0.6 defusedxml==0.7.1 dill==0.3.8 distlib==0.3.7 distro==1.8.0 distro-info==1.5+ubuntu0.23.10.1 docopt==0.6.2 docutils==0.20.1 duplicity==1.2.2 entrypoints==0.4 executing==2.0.1 fairseq @ file:///home/atul/IndicTrans2/fairseq fastapi==0.110.0 fasteners==0.17.3 fastjsonschema==2.19.1 ffmpy==0.3.2 filelock==3.12.2 flatbuffers==24.3.7 fonttools==4.50.0 fqdn==1.5.1 frozenlist==1.4.1 fsspec==2024.2.0 fuse-python==1.0.5 future==0.18.2 gast==0.5.4 google-pasta==0.2.0 googleapis-common-protos==1.63.0 gpg==1.18.0 gradio==4.21.0 gradio_client==0.12.0 grpcio==1.62.1 h11==0.14.0 h5py==3.10.0 httpcore==1.0.4 httplib2==0.20.4 httpx==0.27.0 huggingface-hub==0.21.4 hydra-core==1.0.7 idna==3.3 imagesize==1.4.1 importlib-metadata==4.12.0 importlib_resources==6.3.1 indic-nlp-library @ file:///home/atul/IndicTrans2/indic_nlp_library indic-nlp-library-IT2 @ git+https://github.com/VarunGumma/indic_nlp_library@18e85551b0b331a65d11acf869afd4cfa615bc4c -e git+https://github.com/VarunGumma/IndicTransTokenizer@e20d2d762fa20dea3b87858a4e38adf93ddfe50f#egg=IndicTransTokenizer ipykernel==6.29.3 ipython==8.22.2 ipywidgets==8.1.2 isoduration==20.11.0 jaraco.classes==3.2.1 jedi==0.19.1 jeepney==0.8.0 Jinja2==3.1.3 joblib==1.3.2 json5==0.9.22 
jsonpointer==2.4 jsonschema==4.21.1 jsonschema-specifications==2023.12.1 jupyter==1.0.0 jupyter-console==6.6.3 jupyter-events==0.9.0 jupyter-lsp==2.2.4 jupyter_client==7.4.9 jupyter_core==5.3.1 jupyter_server==2.13.0 jupyter_server_terminals==0.5.2 jupyterlab==4.1.4 jupyterlab-pygments==0.2.2 jupyterlab_server==2.25.4 jupyterlab_widgets==3.0.10 keras==3.1.0 keyring==24.2.0 kiwisolver==1.4.5 language-selector==0.1 launchpadlib==1.11.0 lazr.restfulclient==0.14.5 lazr.uri==1.0.6 libclang==18.1.1 libretranslatepy==2.1.1 lockfile==0.12.2 louis==3.26.0 lxml==5.1.0 Mako==1.2.4.dev0 Markdown==3.6 markdown-it-py==2.2.0 MarkupSafe==2.1.3 matplotlib==3.8.3 matplotlib-inline==0.1.6 mdurl==0.1.2 mistune==3.0.2 ml-dtypes==0.3.2 mock==5.1.0 monotonic==1.6 more-itertools==10.1.0 Morfessor==2.0.6 mosestokenizer==1.2.1 mpmath==1.3.0 multidict==6.0.5 multiprocess==0.70.16 mutagen==1.46.0 namex==0.0.7 nbclient==0.9.0 nbconvert==7.16.2 nbformat==5.9.2 nest-asyncio==1.5.4 netifaces==0.11.0 networkx==3.2.1 nltk==3.8.1 notebook==7.1.1 notebook_shim==0.2.4 numpy==1.26.4 nvidia-cublas-cu12==12.1.3.1 nvidia-cuda-cupti-cu12==12.1.105 nvidia-cuda-nvrtc-cu12==12.1.105 nvidia-cuda-runtime-cu12==12.1.105 nvidia-cudnn-cu12==8.9.2.26 nvidia-cufft-cu12==11.0.2.54 nvidia-curand-cu12==10.3.2.106 nvidia-cusolver-cu12==11.4.5.107 nvidia-cusparse-cu12==12.1.0.106 nvidia-nccl-cu12==2.19.3 nvidia-nvjitlink-cu12==12.4.99 nvidia-nvtx-cu12==12.1.105 oauthlib==3.2.2 olefile==0.46 omegaconf==2.0.6 openfile==0.0.7 opt-einsum==3.3.0 optree==0.10.0 orjson==3.9.15 overrides==7.7.0 packaging==24.0 pandas==2.2.1 pandocfilters==1.5.1 paramiko==2.12.0 parso==0.8.3 pexpect==4.8.0 Pillow==10.0.0 platformdirs==3.10.0 portalocker==2.8.2 prometheus_client==0.20.0 promise==2.3 prompt-toolkit==3.0.43 protobuf==3.20.3 psutil==5.9.8 ptyprocess==0.7.0 pure-eval==0.2.2 py==1.11.0 pyarrow==15.0.2 pyarrow-hotfix==0.6 pycairo==1.24.0 pycparser==2.21 pycryptodomex==3.11.0 pycups==2.0.1 pydantic==2.6.4 pydantic_core==2.16.3 
pydub==0.25.1 Pygments==2.15.1 PyGObject==3.46.0 PyJWT==2.7.0 pylibacl==0.7.0 PyNaCl==1.5.0 pyparsing==3.1.0 PyQt5==5.15.9 PyQt5-sip==12.12.2 python-apt==2.6.0+ubuntu1 python-dateutil==2.8.2 python-debian==0.1.49+ubuntu2 python-json-logger==2.0.7 python-multipart==0.0.9 pytz==2024.1 pyxattr==0.8.1 pyxdg==0.28 PyYAML==6.0.1 pyzmq==24.0.1 qtconsole==5.5.1 QtPy==2.4.1 referencing==0.33.0 regex==2023.12.25 requests==2.31.0 rfc3339-validator==0.1.4 rfc3986-validator==0.1.1 rich==13.3.1 rpds-py==0.18.0 ruff==0.3.3 sacrebleu==2.3.1 sacremoses==0.1.1 safetensors==0.4.2 scikit-learn==1.4.1.post1 scipy==1.12.0 screen-resolution-extra==0.0.0 SecretStorage==3.3.3 semantic-version==2.10.0 Send2Trash==1.8.2 sentencepiece==0.2.0 shellingham==1.5.4 six==1.16.0 sniffio==1.3.1 snowballstemmer==2.2.0 soupsieve==2.5 Sphinx==7.2.6 sphinx-argparse==0.4.0 sphinx-rtd-theme==2.0.0 sphinxcontrib-applehelp==1.0.8 sphinxcontrib-devhelp==1.0.6 sphinxcontrib-htmlhelp==2.0.5 sphinxcontrib-jquery==4.1 sphinxcontrib-jsmath==1.0.1 sphinxcontrib-qthelp==1.0.7 sphinxcontrib-serializinghtml==1.1.10 ssh-import-id==5.11 stack-data==0.6.3 starlette==0.36.3 sympy==1.12 systemd-python==235 tabulate==0.9.0 tensorboard==2.16.2 tensorboard-data-server==0.7.2 tensorflow==2.16.1 tensorflow-addons==0.23.0 tensorflow-datasets==3.2.1 tensorflow-io-gcs-filesystem==0.36.0 tensorflow-metadata==1.14.0 termcolor==2.4.0 terminado==0.18.0 tf2crf==0.1.33 threadpoolctl==3.3.0 tinycss2==1.2.1 tokenizers==0.13.3 tomlkit==0.12.0 toolwrapper==2.1.0 toolz==0.12.1 torch==2.2.1 torchaudio==2.2.1 tornado==6.3.2 tqdm==4.66.2 traitlets==5.14.1 transformers==4.28.1 triton==2.2.0 typeguard==2.13.3 typer==0.9.0 types-python-dateutil==2.8.19.20240311 typing_extensions==4.10.0 tzdata==2024.1 ubuntu-drivers-common==0.0.0 ubuntu-pro-client==8001 uctools==1.3.0 ufw==0.36.2 unattended-upgrades==0.1 urduhack==1.1.1 uri-template==1.3.0 urllib3==1.26.16 usb-creator==0.3.16 uvicorn==0.28.0 virtualenv==20.24.1+ds wadllib==1.3.6 wcwidth==0.2.13 
webcolors==1.13 webencodings==0.5.1 websocket-client==1.7.0 websockets==10.4 Werkzeug==3.0.1 widgetsnbextension==4.0.10 wrapt==1.16.0 xdg==5 xkit==0.0.0 xxhash==3.4.1 yarl==1.9.4 yt-dlp==2023.7.6 zipp==1.0.0

  2. I set this up in a virtual environment. I also set it up in a second virtual environment (since deleted) and got the same error. I use pip for installation.

  3. I did install everything through the notebook. I can try installing outside the notebook as well.

  4. I did not try to make the Colab and local environments equivalent. I could do that, but expecting every dev instance to replicate the Colab environment is not a feasible goal. Are you saying that this notebook is only meant for Google Colab and may not work on a local Jupyter setup?

Thanks for sharing the links. I had already checked them; based on them I reinstalled, then did a clean install in a fresh virtual environment, and then deleted that clean install.
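One thing that may be worth checking, offered as a guess rather than a diagnosis: the traceback paths point at `~/.local/lib/python3.11/site-packages`, which is the per-user site directory, and a virtual environment created without access to system packages does not normally load from there. A short sketch (the function name `isolation_summary` is hypothetical) that can be run in a notebook cell to see whether the kernel is really inside the virtual environment:

```python
import site
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv or virtualenv."""
    return sys.prefix != sys.base_prefix


def isolation_summary() -> dict:
    """Summarize whether this interpreter is isolated from user-level packages."""
    user_site = site.getusersitepackages()
    return {
        "executable": sys.executable,
        "in_virtualenv": in_virtualenv(),
        "user_site": user_site,
        # True means packages under ~/.local can be picked up by this interpreter.
        "user_site_on_path": user_site in sys.path,
    }


if __name__ == "__main__":
    for key, value in isolation_summary().items():
        print(f"{key:>18}: {value}")
```

If `in_virtualenv` is `False` or `user_site_on_path` is `True` inside JupyterLab, the kernel is probably not using the virtual environment that the packages were installed into.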
