name 'M' is not defined, seems M = Model(D_train.get_metadata()) encounter error #32

Open
no7dw opened this issue May 7, 2020 · 0 comments

no7dw commented May 7, 2020

When I run the local test in Docker (CPU version), it reports an error:

root@85a655cc87d1:/app/codalab# python run_local_test.py
2020-05-07 06:36:42 INFO run_local_test.py: ##################################################
2020-05-07 06:36:42 INFO run_local_test.py: Begin running local test using
2020-05-07 06:36:42 INFO run_local_test.py: code_dir = AutoDL_sample_code_submission
2020-05-07 06:36:42 INFO run_local_test.py: dataset_dir = miniciao
2020-05-07 06:36:42 INFO run_local_test.py: ##################################################
2020-05-07 06:36:42 INFO run_local_test.py: Cleaning existing output directory of last run: /app/codalab/AutoDL_sample_result_submission
2020-05-07 06:36:42 INFO run_local_test.py: Cleaning existing output directory of last run: /app/codalab/AutoDL_scoring_output
python /app/codalab/AutoDL_ingestion_program/ingestion.py --dataset_dir=/app/codalab/AutoDL_sample_data/miniciao --code_dir=/app/codalab/AutoDL_sample_code_submission --time_budget=1200.0
python /app/codalab/AutoDL_scoring_program/score.py --solution_dir=/app/codalab/AutoDL_sample_data/miniciao
2020-05-07 06:36:43,653 INFO score.py: ===== Start scoring program. Version: v20191204 =====
2020-05-07 06:36:44,673 INFO ingestion.py: ************************************************
2020-05-07 06:36:44,673 INFO ingestion.py: ******** Processing dataset Miniciao ********
2020-05-07 06:36:44,673 INFO ingestion.py: ************************************************
2020-05-07 06:36:44,673 INFO ingestion.py: Reading training set and test set...
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/tensor_array_ops.py:162: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
2020-05-07 06:36:44,928 INFO ingestion.py: Creating model...this process should not exceed 20min.
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/model.py", line 19, in <lambda>
    threading.Thread(target=lambda: torch.cuda.synchronize()),
  File "/usr/local/lib/python3.5/dist-packages/torch/cuda/__init__.py", line 398, in synchronize
    _lazy_init()
  File "/usr/local/lib/python3.5/dist-packages/torch/cuda/__init__.py", line 192, in _lazy_init
    _check_driver()
  File "/usr/local/lib/python3.5/dist-packages/torch/cuda/__init__.py", line 102, in _check_driver
    http://www.nvidia.com/Download/index.aspx""")
AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx

2020-05-07 06:36:46,014 INFO ingestion.py: Initialization success, time spent so far 1.0854098796844482 sec
2020-05-07 06:36:46,014 ERROR ingestion.py: Failed to initializing model.
2020-05-07 06:36:46,015 ERROR ingestion.py: Encountered exception:
Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
Traceback (most recent call last):
  File "/app/codalab/AutoDL_ingestion_program/ingestion.py", line 339, in <module>
    M = Model(D_train.get_metadata()) # The metadata of D_train and D_test only differ in sample_count
  File "/usr/lib/python3.5/contextlib.py", line 77, in __exit__
    self.gen.throw(type, value, traceback)
  File "/app/codalab/AutoDL_ingestion_program/ingestion.py", line 208, in time_limit
    yield
  File "/app/codalab/AutoDL_ingestion_program/ingestion.py", line 339, in <module>
    M = Model(D_train.get_metadata()) # The metadata of D_train and D_test only differ in sample_count
  File "/app/codalab/AutoDL_sample_code_submission/model.py", line 54, in __init__
    self.domain_model = DomainModel(self.metadata)
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/model.py", line 42, in __init__
    super(Model, self).__init__(metadata)
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/skeleton/projects/logic.py", line 88, in __init__
    self.build()
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/model.py", line 66, in build
    self.model_9.init(model_dir=model_path, gain=1.0)
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/architectures/resnet.py", line 244, in init
    model_dir=self.model_dir)
  File "/usr/local/lib/python3.5/dist-packages/torch/hub.py", line 499, in load_state_dict_from_url
    return torch.load(cached_file, map_location=map_location)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 426, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 613, in _load
    result = unpickler.load()
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 576, in persistent_load
    deserialized_objects[root_key] = restore_location(obj, location)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 155, in default_restore_location
    result = fn(storage, location)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 131, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 115, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
2020-05-07 06:36:46,035 INFO ingestion.py: ===== Start core part of ingestion program. Version: v20191204 =====
2020-05-07 06:36:46,039 INFO ingestion.py: Failed to run ingestion.
2020-05-07 06:36:46,039 ERROR ingestion.py: Encountered exception:
name 'M' is not defined
Traceback (most recent call last):
  File "/app/codalab/AutoDL_ingestion_program/ingestion.py", line 358, in <module>
    if not hasattr(M, attr):
NameError: name 'M' is not defined
2020-05-07 06:36:46,044 INFO ingestion.py: Wrote the file end.txt marking the end of ingestion.
2020-05-07 06:36:46,045 INFO ingestion.py: [-] Done, but encountered some errors during ingestion.
2020-05-07 06:36:46,045 INFO ingestion.py: [-] Overall time spent  0.01 sec
2020-05-07 06:36:46,079 INFO ingestion.py: [Ingestion terminated]
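
Reading the log, the `NameError` looks like a follow-up symptom rather than the root cause: `Model(...)` raised before `M` was ever assigned, and the program then continued to a line that uses `M`. A minimal sketch of that pattern, assuming nothing about the real ingestion code (`build_model` is a hypothetical stand-in, and the error text is simulated):

```python
errors = []


def build_model():
    """Hypothetical stand-in for a constructor that fails while loading weights."""
    raise RuntimeError("simulated checkpoint loading failure")


try:
    M = build_model()  # the assignment never happens because the call raises
except RuntimeError as exc:
    errors.append("init: " + str(exc))

try:
    hasattr(M, "train")  # M was never bound, so this raises NameError
except NameError as exc:
    errors.append("follow-up: " + str(exc))

for line in errors:
    print(line)
```

If that reading is right, fixing the first exception should make the second one disappear as well.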

At first I thought it was a network issue while downloading the training data, but I tried running the test through a proxy, and I also downloaded the r9-xxx.pth.tar file manually. Even after building on another machine (with Docker, of course), I still had no luck.

It's weird that the log reports:

Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False.

even though I'm using the CPU version of the Docker image.
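
The error message itself suggests passing `map_location` when loading, which may be what is missing on the CPU image. A minimal sketch of that idea, assuming the checkpoint was saved from a CUDA device; `choose_map_location` and `load_checkpoint` are hypothetical helper names, not part of the sample submission:

```python
from typing import Optional


def choose_map_location(cuda_available: bool) -> Optional[str]:
    """Pick a map_location value for torch.load.

    Returns "cpu" when CUDA is unavailable, so tensors saved on a CUDA device
    are remapped to CPU storage, and None (torch's default) otherwise.
    """
    return None if cuda_available else "cpu"


def load_checkpoint(path: str):
    """Hypothetical helper: load a checkpoint on either a CPU-only or a GPU host."""
    import torch  # imported lazily so choose_map_location works without torch

    location = choose_map_location(torch.cuda.is_available())
    return torch.load(path, map_location=location)
```

Judging from the traceback, `torch.hub.load_state_dict_from_url` forwards its `map_location` argument to `torch.load`, so the same value could probably be passed there instead of patching the load call directly.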
