RuntimeError: UNSUPPORTED DTYPE (sockeye-translate --use-cpu --dtype bfloat16) #1084

Open
SamuelLarkin opened this issue Feb 7, 2023 · 0 comments

SamuelLarkin commented Feb 7, 2023

Hi,
following up on #1083 (comment), translating on CPU with bfloat16 fails under pytorch-1.11.0. With pytorch-1.13.1, the same command translates successfully.

Something else could be the cause, but based on these two simple tests, it looks like pytorch-1.11.0 is not sufficient for this combination. If so, requirements.txt should reflect that.
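
Until the requirement is settled, a version guard along these lines could make the failure explicit instead of surfacing as `UNSUPPORTED DTYPE` deep in beam search. This is a minimal sketch, not Sockeye code: `MIN_TORCH_FOR_CPU_BF16`, `parse_version`, and `torch_version_ok` are hypothetical names, and the threshold 1.13.1 is only the version that worked in the tests above. Versions between 1.11.0 and 1.13.1 were not tried, so the true minimum may be lower.

```python
import re

# Assumption: 1.13.1 is simply the version observed to work in this report.
# The real minimum for CPU bfloat16 decoding may be lower (1.12.x was not tested).
MIN_TORCH_FOR_CPU_BF16 = (1, 13, 1)


def parse_version(version: str) -> tuple:
    """Turn '1.13.1' or '1.13.1+cu116' into a comparable tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def torch_version_ok(version: str, minimum: tuple = MIN_TORCH_FOR_CPU_BF16) -> bool:
    """Return True if `version` is at least `minimum`."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    for candidate in ("1.11.0", "1.13.1"):
        print(candidate, torch_version_ok(candidate))
```

In practice the version string would come from `torch.__version__`, checked only when both `--use-cpu` and `--dtype bfloat16` are requested.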

Command

python -m sockeye.translate --output-type json --batch-size 32 --models ../model --input source.en --use-cpu --dtype bfloat16

Error Message

[INFO:sockeye.utils] Sockeye: 3.1.31, commit 13c63be5e6999102cd8f76065dab618667d54c8d, path /gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/__init__.py
[INFO:sockeye.utils] PyTorch: 1.11.0 (/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/torch/__init__.py)
[INFO:sockeye.utils] Command: /gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/translate.py --output-type json --batch-size 32 --models ../model --input source.en --use-cpu --dtype bfloat16
[INFO:sockeye.utils] Arguments: Namespace(batch_size=32, beam_search_stop='all', beam_size=5, brevity_penalty_constant_length_ratio=0.0, brevity_penalty_type='none', brevity_penalty_weight=1.0, bucket_width=10, checkpoints=None, chunk_size=None, clamp_to_dtype=False, config=None, device_id=0, dtype='bfloat16', ensemble_mode='linear', env=None, greedy=False, input='source.en', input_factors=None, json_input=False, knn_index=None, knn_lambda=0.8, length_penalty_alpha=1.0, length_penalty_beta=0.0, loglevel='INFO', loglevel_secondary_workers='INFO', max_input_length=None, max_output_length=None, max_output_length_num_stds=2, models=['../model'], nbest_size=1, no_logfile=False, nvs_thresh=0.5, output=None, output_type='json', prevent_unk=False, quiet=False, quiet_secondary_workers=False, restrict_lexicon=None, restrict_lexicon_topk=None, sample=None, seed=None, skip_nvs=False, strip_unknown_words=False, tf32=True, use_cpu=True)
[INFO:__main__] Translate Device: cpu
[INFO:sockeye.model] Loading 1 model(s) from ['../model'] ...
[INFO:sockeye.vocab] Vocabulary (32170 words) loaded from "../model/vocab.src.0.json"
[INFO:sockeye.vocab] Vocabulary (32170 words) loaded from "../model/vocab.trg.0.json"
[INFO:sockeye.model] Model version: 3.1.27
[INFO:sockeye.model] Loaded model config from "../model/config"
[INFO:sockeye.model] Disabling dropout layers for performance reasons
[INFO:sockeye.model] ModelConfig(config_data=DataConfig(data_statistics=DataStatistics(num_sents=18792562, num_discarded=4514, num_tokens_source=396805440, num_tokens_target=452828161, num_unks_source=151, num_unks_target=150, max_observed_len_source=201, max_observed_len_target=201, size_vocab_source=32170, size_vocab_target=32170, length_ratio_mean=1.149165240213179, length_ratio_std=0.3331394866848643, buckets=[(8, 8), (16, 16), (24, 24), (32, 32), (40, 40), (48, 48), (56, 56), (64, 64), (72, 72), (80, 80), (88, 88), (96, 96), (104, 104), (112, 112), (120, 120), (128, 128), (136, 136), (144, 144), (152, 152), (160, 160), (168, 168), (176, 176), (184, 184), (192, 192), (200, 200), (201, 201)], num_sents_per_bucket=[2488902, 5093385, 3839506, 2546445, 1811528, 1189585, 737992, 443043, 261406, 152182, 89417, 52498, 31055, 18857, 11863, 7560, 4930, 3414, 2367, 1796, 1381, 1081, 892, 762, 633, 82], average_len_target_per_bucket=[4.876037945069864, 12.773357231261594, 19.580236753332052, 27.41336983211651, 35.28823864603507, 43.238106587956544, 51.26582055583915, 59.238620628568825, 67.207513428303, 75.18162933867328, 83.15075619854574, 91.11991730707263, 99.07102421459187, 106.99094438055295, 114.94170231821508, 122.88471587159029, 130.96489273583816, 138.7659185760993, 146.50905591051549, 154.6460870395282, 162.2368031213424, 170.14799935605743, 178.83465301964753, 186.0052493281894, 193.85046425851448, 199.38909318975016], length_ratio_stats_per_bucket=[(1.0693756944877173, 0.2734448342497526), (1.0857894201553209, 0.28019690452817625), (1.1544188404375997, 0.3868604549199259), (1.1861185841999833, 0.3074059313735606), (1.2060657841545896, 0.2981587515456939), (1.226324650666722, 0.30493095494070144), (1.2444125341378565, 0.3242223370686047), (1.2611266481183327, 0.3709455795374948), (1.274644337588064, 0.4302137928506163), (1.2860484970016222, 0.48695434193951453), (1.302799569788393, 0.5653045419192184), (1.3120006329314209, 0.6142970487431451), (1.3295351237968134, 0.7814292394252162), (1.3384637458091257, 0.8116763474141028), (1.351862242960138, 0.9642116646873813), (1.3368067683991776, 0.7653903732034699), (1.3670752245352829, 0.9719727938185959), (1.3805636470652694, 1.0975590088160094), (1.3476927572822692, 0.696317634165507), (1.3496332871268524, 0.7573960914043955), (1.3046733705213736, 0.6783789455528596), (1.3753328246704346, 1.6470598091351123), (1.3040674746204497, 1.059827519965373), (1.253641535651391, 0.5013375061317442), (1.2480487830675664, 0.3590853095382778), (1.2543975958596236, 0.31963245113954747)]), max_seq_len_source=201, max_seq_len_target=201, num_source_factors=1, num_target_factors=1), vocab_source_size=32170, vocab_target_size=32170, config_embed_source=EmbeddingConfig(vocab_size=32170, num_embed=1024, dropout=0.0, num_factors=1, factor_configs=None, allow_sparse_grad=False), config_embed_target=EmbeddingConfig(vocab_size=32170, num_embed=1024, dropout=0.0, num_factors=1, factor_configs=None, allow_sparse_grad=False), config_encoder=TransformerConfig(model_size=1024, attention_heads=16, feed_forward_num_hidden=4096, act_type='relu', num_layers=6, dropout_attention=0.0, dropout_act=0.0, dropout_prepost=0.0, positional_embedding_type='fixed', preprocess_sequence='n', postprocess_sequence='dr', max_seq_len_source=201, max_seq_len_target=201, decoder_type='transformer', use_lhuc=False, depth_key_value=1024, use_glu=False), config_decoder=TransformerConfig(model_size=1024, attention_heads=16, feed_forward_num_hidden=4096, act_type='relu', num_layers=6, dropout_attention=0.0, dropout_act=0.0, dropout_prepost=0.0, positional_embedding_type='fixed', preprocess_sequence='n', postprocess_sequence='dr', max_seq_len_source=201, max_seq_len_target=201, decoder_type='transformer', use_lhuc=False, depth_key_value=1024, use_glu=False), config_length_task=None, weight_tying_type='src_trg_softmax', lhuc=False, dtype='float32', neural_vocab_selection=None, neural_vocab_selection_block_loss=False)
[INFO:sockeye.model] Loaded params from "../model/params.best" to "cpu"
[INFO:sockeye.model] Casting SockeyeModel to dtype torch.bfloat16
[INFO:sockeye.model] Model dtype: overridden to bfloat16
[INFO:sockeye.model] 1 model(s) loaded in 7.1540s
[INFO:sockeye.inference] Translator (1 model(s) beam_size=5 algorithm=BeamSearch, beam_search_stop=all max_input_length=200 nbest_size=1 ensemble_mode=None max_batch_size=32 dtype=torch.bfloat16 skip_nvs=False nvs_thresh=0.5)
[INFO:__main__] Translating...
/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/torch/jit/_trace.py:958: TracerWarning: Encountering a list at the output of the tracer might cause the trace to be incorrect, this is only valid if the container structure does not change based on the module's inputs. Consider using a constant container instead (e.g. for `list`, use a `tuple` instead. for `dict`, use a `NamedTuple` instead). If you absolutely need this and know the side effects, pass strict=False to trace() to allow this behavior.
  module._c._create_method_from_trace(
[ERROR:root] Uncaught exception
Traceback (most recent call last):
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/translate.py", line 264, in <module>
    main()
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/translate.py", line 42, in main
    run_translate(args)
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/translate.py", line 146, in run_translate
    read_and_translate(translator=translator,
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/translate.py", line 232, in read_and_translate
    chunk_time = translate(output_handler, chunk, translator)
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/translate.py", line 255, in translate
    trans_outputs = translator.translate(trans_inputs)
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/inference.py", line 943, in translate
    batch_translations = self._translate_np(*self._get_inference_input(translator_inputs))
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/inference.py", line 1184, in _translate_np
    return self._get_best_translations(self._search(source,
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/beam_search.py", line 1047, in forward
    lengths, estimated_reference_lengths = self._traced_sort_norm_and_update_finished(*_sort_inputs)
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
RuntimeError: UNSUPPORTED DTYPE
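
The traceback ends inside a traced module and does not name the operator that rejected bfloat16. One way to narrow it down would be to run a few candidate operations on small CPU tensors under each PyTorch version and record which ones raise. The sketch below makes several assumptions: `probe` and `bfloat16_cpu_candidates` are hypothetical helpers, and the candidate list (sort, topk, where, masked_fill) is only a guess at what a sort-and-normalize step in beam search might use. It was not read from the Sockeye source.

```python
from typing import Callable, Dict, Optional


def probe(ops: Dict[str, Callable[[], object]]) -> Dict[str, Optional[str]]:
    """Run each zero-argument callable; map its name to None on success
    or to the error message if it raised RuntimeError."""
    results: Dict[str, Optional[str]] = {}
    for name, op in ops.items():
        try:
            op()
            results[name] = None
        except RuntimeError as err:
            results[name] = str(err)
    return results


def bfloat16_cpu_candidates() -> Dict[str, Callable[[], object]]:
    """Candidate ops on a CPU bfloat16 tensor (guesses, not taken from Sockeye).
    Returns an empty dict if torch is not installed."""
    try:
        import torch
    except ImportError:
        return {}
    base = torch.arange(6, dtype=torch.float32).reshape(2, 3)
    mask = base > 2                  # build the mask in float32 so setup cannot fail
    x = base.to(torch.bfloat16)      # CPU tensor in the dtype under test
    return {
        "sort": lambda: torch.sort(x, dim=-1),
        "topk": lambda: torch.topk(x, k=2, dim=-1),
        "where": lambda: torch.where(mask, x, torch.zeros_like(x)),
        "masked_fill": lambda: x.masked_fill(mask, 0.0),
    }


if __name__ == "__main__":
    for name, error in probe(bfloat16_cpu_candidates()).items():
        print(f"{name}: {'ok' if error is None else error}")
```

Running the script once under pytorch-1.11.0 and once under pytorch-1.13.1 and comparing the output would show whether any of these candidates is involved. If all of them pass on both versions, the failing operator is something else.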

Conda Env Export

name: sockeye-3.1.31
channels:
  - pytorch
  - nvidia
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - _openmp_mutex=5.1=1_gnu
  - blas=1.0=mkl
  - bzip2=1.0.8=h7b6447c_0
  - ca-certificates=2023.01.10=h06a4308_0
  - certifi=2022.12.7=py310h06a4308_0
  - cuda=11.6.1=0
  - cuda-cccl=11.6.55=hf6102b2_0
  - cuda-command-line-tools=11.6.2=0
  - cuda-compiler=11.6.2=0
  - cuda-cudart=11.6.55=he381448_0
  - cuda-cudart-dev=11.6.55=h42ad0f4_0
  - cuda-cuobjdump=11.6.124=h2eeebcb_0
  - cuda-cupti=11.6.124=h86345e5_0
  - cuda-cuxxfilt=11.6.124=hecbf4f6_0
  - cuda-driver-dev=11.6.55=0
  - cuda-gdb=12.0.90=0
  - cuda-libraries=11.6.1=0
  - cuda-libraries-dev=11.6.1=0
  - cuda-memcheck=11.8.86=0
  - cuda-nsight=12.0.78=0
  - cuda-nsight-compute=12.0.0=0
  - cuda-nvcc=11.6.124=hbba6d2d_0
  - cuda-nvdisasm=12.0.76=0
  - cuda-nvml-dev=11.6.55=haa9ef22_0
  - cuda-nvprof=12.0.90=0
  - cuda-nvprune=11.6.124=he22ec0a_0
  - cuda-nvrtc=11.6.124=h020bade_0
  - cuda-nvrtc-dev=11.6.124=h249d397_0
  - cuda-nvtx=11.6.124=h0630a44_0
  - cuda-nvvp=12.0.90=0
  - cuda-runtime=11.6.1=0
  - cuda-samples=11.6.101=h8efea70_0
  - cuda-sanitizer-api=12.0.90=0
  - cuda-toolkit=11.6.1=0
  - cuda-tools=11.6.1=0
  - cuda-visual-tools=11.6.1=0
  - flit-core=3.6.0=pyhd3eb1b0_0
  - gds-tools=1.5.0.59=0
  - intel-openmp=2022.1.0=h9e868ea_3769
  - ld_impl_linux-64=2.38=h1181459_1
  - libcublas=11.9.2.110=h5e84587_0
  - libcublas-dev=11.9.2.110=h5c901ab_0
  - libcufft=10.7.1.112=hf425ae0_0
  - libcufft-dev=10.7.1.112=ha5ce4c0_0
  - libcufile=1.5.0.59=0
  - libcufile-dev=1.5.0.59=0
  - libcurand=10.3.1.50=0
  - libcurand-dev=10.3.1.50=0
  - libcusolver=11.3.4.124=h33c3c4e_0
  - libcusparse=11.7.2.124=h7538f96_0
  - libcusparse-dev=11.7.2.124=hbbe9722_0
  - libffi=3.4.2=h6a678d5_6
  - libgcc-ng=11.2.0=h1234567_1
  - libgomp=11.2.0=h1234567_1
  - libnpp=11.6.3.124=hd2722f0_0
  - libnpp-dev=11.6.3.124=h3c42840_0
  - libnvjpeg=11.6.2.124=hd473ad6_0
  - libnvjpeg-dev=11.6.2.124=hb5906b9_0
  - libstdcxx-ng=11.2.0=h1234567_1
  - libuuid=1.41.5=h5eee18b_0
  - mkl=2022.1.0=hc2b9512_224
  - ncurses=6.3=h5eee18b_3
  - nsight-compute=2022.4.0.15=0
  - openssl=1.1.1s=h7f8727e_0
  - pip=22.3.1=py310h06a4308_0
  - python=3.10.9=h7a1cb2a_0
  - pytorch=1.13.1=py3.10_cuda11.6_cudnn8.3.2_0
  - pytorch-cuda=11.6=h867d48c_1
  - pytorch-mutex=1.0=cuda
  - readline=8.2=h5eee18b_0
  - setuptools=65.6.3=py310h06a4308_0
  - sqlite=3.40.1=h5082296_0
  - tk=8.6.12=h1ccaba5_0
  - typing_extensions=4.4.0=py310h06a4308_0
  - tzdata=2022g=h04d1e81_0
  - wheel=0.37.1=pyhd3eb1b0_0
  - xz=5.2.10=h5eee18b_1
  - zlib=1.2.13=h5eee18b_0
  - pip:
      - aiohttp==3.8.3
      - aiosignal==1.3.1
      - async-timeout==4.0.2
      - attrs==22.2.0
      - charset-normalizer==2.1.1
      - codetiming==1.4.0
      - colorama==0.4.6
      - datasets==2.8.0
      - dill==0.3.6
      - filelock==3.9.0
      - frozenlist==1.3.3
      - fsspec==2023.1.0
      - huggingface-hub==0.11.1
      - idna==3.4
      - joblib==1.2.0
      - lxml==4.9.2
      - multidict==6.0.4
      - multiprocess==0.70.14
      - numpy==1.24.1
      - packaging==23.0
      - pandas==1.5.3
      - portalocker==2.7.0
      - py-spy==0.3.14
      - pyarrow==10.0.1
      - python-dateutil==2.8.2
      - pytz==2022.7.1
      - pyyaml==6.0
      - regex==2022.10.31
      - requests==2.28.2
      - responses==0.18.0
      - sacrebleu==2.3.1
      - scikit-learn==1.2.0
      - scipy==1.10.0
      - six==1.16.0
      - sockeye==3.1.31
      - tabulate==0.9.0
      - threadpoolctl==3.1.0
      - tokenizers==0.12.1
      - tqdm==4.64.1
      - transformers==4.20.1
      - urllib3==1.26.14
      - xxhash==3.2.0
      - yarl==1.8.2
prefix: /gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31
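
Note that this export lists pytorch=1.13.1 on python=3.10.9, while the error log above reports PyTorch 1.11.0 under a python3.8 path, so the export presumably describes the environment after the upgrade rather than the failing one. A small check like the following could confirm which PyTorch version a given log and a given export refer to. It is only a sketch: both function names are hypothetical, the regular expressions assume the exact line shapes seen in this report, and the sample path in the demo is a placeholder.

```python
import re
from typing import Optional


def torch_version_from_log(log_text: str) -> Optional[str]:
    """Find the version in a line like '[INFO:sockeye.utils] PyTorch: 1.11.0 (...)'."""
    match = re.search(r"PyTorch:\s*(\d+(?:\.\d+)+)", log_text)
    return match.group(1) if match else None


def torch_version_from_conda_export(export_text: str) -> Optional[str]:
    """Find the version in a line like '  - pytorch=1.13.1=py3.10_cuda11.6_...'.
    Entries such as 'pytorch-cuda' or 'pytorch-mutex' are not matched."""
    match = re.search(r"^\s*-\s*pytorch=([^=\s]+)", export_text, flags=re.MULTILINE)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Placeholder inputs shaped like the lines in this report.
    log = "[INFO:sockeye.utils] PyTorch: 1.11.0 (/path/to/torch/__init__.py)"
    export = (
        "dependencies:\n"
        "  - pytorch=1.13.1=py3.10_cuda11.6_cudnn8.3.2_0\n"
        "  - pytorch-cuda=11.6=h867d48c_1\n"
    )
    print(torch_version_from_log(log), torch_version_from_conda_export(export))
```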