DIGITS 6 docker - Can't train MNIST demo #2237

Open
guillebot opened this issue Sep 1, 2020 · 0 comments

Hello there

I've been fighting this for a couple of days with no luck.

I have clean installs of Ubuntu 19.04 and 20.04, and the result is the same on both.

Docker 19.03.6

digits-20.08-tensorflow-py3
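
In case more host detail is useful (the GPU driver version is not listed above), here is a minimal sketch for collecting it. The function name `collect_versions` and the choice of commands are my own assumptions about a typical Ubuntu + Docker + NVIDIA host; any tool that is not installed is simply reported as "not found".

```python
import shutil
import subprocess

# Assumed commands for a typical Ubuntu + Docker + NVIDIA host; adjust as needed.
COMMANDS = {
    "os": ["lsb_release", "-ds"],
    "docker": ["docker", "--version"],
    "gpu_driver": [
        "nvidia-smi",
        "--query-gpu=name,driver_version",
        "--format=csv,noheader",
    ],
}


def collect_versions(commands=COMMANDS, timeout=10):
    """Run each command and return {label: output}, tolerating missing tools."""
    report = {}
    for label, argv in commands.items():
        # Skip tools that are not on PATH instead of raising.
        if shutil.which(argv[0]) is None:
            report[label] = "not found"
            continue
        try:
            result = subprocess.run(
                argv, capture_output=True, text=True, timeout=timeout
            )
            output = (result.stdout or result.stderr).strip()
        except (OSError, subprocess.SubprocessError) as exc:
            output = f"failed: {exc}"
        report[label] = output
    return report


if __name__ == "__main__":
    for label, value in collect_versions().items():
        print(f"{label}: {value}")
```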

I can load the MNIST dataset, but when I try to train the model as described in the Getting Started doc, the job fails with the following output:

WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
* https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
* https://github.com/tensorflow/addons
* https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.
WARNING:tensorflow:From /opt/digits/digits/tools/tensorflow_tools/main.py:743: The name tf.app.run is deprecated. Please use tf.compat.v1.app.run instead.
2020-09-01 14:32:03 [WARNING] From /opt/digits/digits/tools/tensorflow_tools/main.py:743: The name tf.app.run is deprecated. Please use tf.compat.v1.app.run instead.
I0901 14:32:03.447427 140333568571200 main.py:415] Train batch size is 16 and validation batch size is 16
I0901 14:32:03.447491 140333568571200 main.py:419] Training epochs to be completed for each validation : 1
I0901 14:32:03.447914 140333568571200 main.py:423] Training epochs to be completed before taking a snapshot : 1.0
I0901 14:32:03.448702 140333568571200 main.py:427] Model weights will be saved as snapshot_<EPOCH>_Model.ckpt
I0901 14:32:03.448931 140333568571200 main.py:440] Loading mean tensor from /workspace/jobs/20200901-143034-bb04/mean.binaryproto file
I0901 14:32:03.450155 140333568571200 main.py:446] Loading label definitions from /workspace/jobs/20200901-143034-bb04/labels.txt file
I0901 14:32:03.450254 140333568571200 main.py:452] Found 10 classes
I0901 14:32:03.460045 140333568571200 tf_data.py:221] Found 44999 images in db /workspace/jobs/20200901-143034-bb04/train_db
2020-09-01 14:32:03.558992: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2799925000 Hz
2020-09-01 14:32:03.559640: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4ca9c40 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-01 14:32:03.559649: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-09-01 14:32:03.560947: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-09-01 14:32:03.634859: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:985] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-01 14:32:03.635218: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x52db5b0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-09-01 14:32:03.635228: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce RTX 2080 SUPER, Compute Capability 7.5
2020-09-01 14:32:03.635458: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:985] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-01 14:32:03.635733: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1634] Found device 0 with properties:
name: GeForce RTX 2080 SUPER major: 7 minor: 5 memoryClockRate(GHz): 1.83
pciBusID: 0000:01:00.0
2020-09-01 14:32:03.635748: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.11.0
2020-09-01 14:32:03.640595: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.11
2020-09-01 14:32:03.643824: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-01 14:32:03.644758: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-01 14:32:03.649951: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-01 14:32:03.651165: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.11
2020-09-01 14:32:03.651248: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8
2020-09-01 14:32:03.651301: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:985] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-01 14:32:03.651604: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:985] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-01 14:32:03.651859: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1762] Adding visible gpu devices: 0
2020-09-01 14:32:03.652041: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.11.0
2020-09-01 14:32:04.177699: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1175] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-01 14:32:04.177720: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181]      0
2020-09-01 14:32:04.177724: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1194] 0:   N
2020-09-01 14:32:04.178016: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:985] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-01 14:32:04.178343: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:985] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-01 14:32:04.178635: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1320] Created TensorFlow device (/device:GPU:0 with 6552 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 SUPER, pci bus id: 0000:01:00.0, compute capability: 7.5)
WARNING:tensorflow:From /opt/digits/digits/tools/tensorflow_tools/model.py:156: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.
W0901 14:32:04.179302 140333568571200 module_wrapper.py:139] From /opt/digits/digits/tools/tensorflow_tools/model.py:156: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.
WARNING:tensorflow:From /opt/digits/digits/tools/tensorflow_tools/model.py:160: The name tf.get_variable_scope is deprecated. Please use tf.compat.v1.get_variable_scope instead.
W0901 14:32:04.224908 140333568571200 module_wrapper.py:139] From /opt/digits/digits/tools/tensorflow_tools/model.py:160: The name tf.get_variable_scope is deprecated. Please use tf.compat.v1.get_variable_scope instead.
WARNING:tensorflow:From /opt/digits/digits/tools/tensorflow_tools/model.py:300: The name tf.get_collection is deprecated. Please use tf.compat.v1.get_collection instead.
W0901 14:32:04.228342 140333568571200 module_wrapper.py:139] From /opt/digits/digits/tools/tensorflow_tools/model.py:300: The name tf.get_collection is deprecated. Please use tf.compat.v1.get_collection instead.
WARNING:tensorflow:From /opt/digits/digits/tools/tensorflow_tools/model.py:300: The name tf.GraphKeys is deprecated. Please use tf.compat.v1.GraphKeys instead.
W0901 14:32:04.228421 140333568571200 module_wrapper.py:139] From /opt/digits/digits/tools/tensorflow_tools/model.py:300: The name tf.GraphKeys is deprecated. Please use tf.compat.v1.GraphKeys instead.
WARNING:tensorflow:From /opt/digits/digits/tools/tensorflow_tools/model.py:168: The name tf.add_to_collection is deprecated. Please use tf.compat.v1.add_to_collection instead.
W0901 14:32:04.228488 140333568571200 module_wrapper.py:139] From /opt/digits/digits/tools/tensorflow_tools/model.py:168: The name tf.add_to_collection is deprecated. Please use tf.compat.v1.add_to_collection instead.
I0901 14:32:04.229444 140333568571200 model.py:265] Optimizer:sgd
WARNING:tensorflow:From /opt/digits/digits/tools/tensorflow_tools/model.py:267: The name tf.train.GradientDescentOptimizer is deprecated. Please use tf.compat.v1.train.GradientDescentOptimizer instead.
W0901 14:32:04.229521 140333568571200 module_wrapper.py:139] From /opt/digits/digits/tools/tensorflow_tools/model.py:267: The name tf.train.GradientDescentOptimizer is deprecated. Please use tf.compat.v1.train.GradientDescentOptimizer instead.
WARNING:tensorflow:From /opt/digits/digits/tools/tensorflow_tools/model.py:259: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
W0901 14:32:04.229628 140333568571200 module_wrapper.py:139] From /opt/digits/digits/tools/tensorflow_tools/model.py:259: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From /opt/digits/digits/tools/tensorflow_tools/model.py:251: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.
W0901 14:32:04.265417 140333568571200 module_wrapper.py:139] From /opt/digits/digits/tools/tensorflow_tools/model.py:251: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.
I0901 14:32:04.274517 140333568571200 tf_data.py:221] Found 15001 images in db /workspace/jobs/20200901-143034-bb04/val_db
2020-09-01 14:32:04.306558: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:985] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-01 14:32:04.306891: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1634] Found device 0 with properties:
name: GeForce RTX 2080 SUPER major: 7 minor: 5 memoryClockRate(GHz): 1.83
pciBusID: 0000:01:00.0
2020-09-01 14:32:04.306907: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.11.0
2020-09-01 14:32:04.306926: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.11
2020-09-01 14:32:04.306934: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-01 14:32:04.306941: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-01 14:32:04.306949: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-01 14:32:04.306956: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.11
2020-09-01 14:32:04.306964: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8
2020-09-01 14:32:04.307001: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:985] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-01 14:32:04.307274: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:985] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-01 14:32:04.307520: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1762] Adding visible gpu devices: 0
2020-09-01 14:32:04.307535: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1175] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-01 14:32:04.307538: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181]      0
2020-09-01 14:32:04.307541: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1194] 0:   N
2020-09-01 14:32:04.307586: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:985] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-01 14:32:04.307860: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:985] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-01 14:32:04.308111: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1320] Created TensorFlow device (/device:GPU:0 with 6552 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 SUPER, pci bus id: 0000:01:00.0, compute capability: 7.5)
WARNING:tensorflow:From /opt/digits/digits/tools/tensorflow_tools/main.py:518: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.
W0901 14:32:04.327086 140333568571200 module_wrapper.py:139] From /opt/digits/digits/tools/tensorflow_tools/main.py:518: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.
WARNING:tensorflow:From /opt/digits/digits/tools/tensorflow_tools/main.py:524: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.
W0901 14:32:04.327216 140333568571200 module_wrapper.py:139] From /opt/digits/digits/tools/tensorflow_tools/main.py:524: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.
2020-09-01 14:32:04.327445: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:985] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-01 14:32:04.327703: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1634] Found device 0 with properties:
name: GeForce RTX 2080 SUPER major: 7 minor: 5 memoryClockRate(GHz): 1.83
pciBusID: 0000:01:00.0
2020-09-01 14:32:04.327713: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.11.0
2020-09-01 14:32:04.327723: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.11
2020-09-01 14:32:04.327730: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-01 14:32:04.327736: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-01 14:32:04.327743: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-01 14:32:04.327749: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.11
2020-09-01 14:32:04.327756: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8
2020-09-01 14:32:04.327786: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:985] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-01 14:32:04.328058: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:985] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-01 14:32:04.328304: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1762] Adding visible gpu devices: 0
2020-09-01 14:32:04.328313: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1175] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-01 14:32:04.328316: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181]      0
2020-09-01 14:32:04.328318: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1194] 0:   N
2020-09-01 14:32:04.328360: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:985] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-01 14:32:04.328633: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:985] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-01 14:32:04.328884: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1320] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6552 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 SUPER, pci bus id: 0000:01:00.0, compute capability: 7.5)
WARNING:tensorflow:From /opt/digits/digits/tools/tensorflow_tools/main.py:538: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.
W0901 14:32:04.329487 140333568571200 module_wrapper.py:139] From /opt/digits/digits/tools/tensorflow_tools/main.py:538: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.
WARNING:tensorflow:From /opt/digits/digits/tools/tensorflow_tools/main.py:541: The name tf.global_variables_initializer is deprecated. Please use tf.compat.v1.global_variables_initializer instead.
W0901 14:32:04.336099 140333568571200 module_wrapper.py:139] From /opt/digits/digits/tools/tensorflow_tools/main.py:541: The name tf.global_variables_initializer is deprecated. Please use tf.compat.v1.global_variables_initializer instead.
WARNING:tensorflow:From /opt/digits/digits/tools/tensorflow_tools/main.py:541: The name tf.local_variables_initializer is deprecated. Please use tf.compat.v1.local_variables_initializer instead.
W0901 14:32:04.336723 140333568571200 module_wrapper.py:139] From /opt/digits/digits/tools/tensorflow_tools/main.py:541: The name tf.local_variables_initializer is deprecated. Please use tf.compat.v1.local_variables_initializer instead.
WARNING:tensorflow:From /opt/digits/digits/tools/tensorflow_tools/main.py:549: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead.
W0901 14:32:04.432924 140333568571200 module_wrapper.py:139] From /opt/digits/digits/tools/tensorflow_tools/main.py:549: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead.
WARNING:tensorflow:From /opt/digits/digits/tools/tensorflow_tools/main.py:557: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
W0901 14:32:04.498357 140333568571200 module_wrapper.py:139] From /opt/digits/digits/tools/tensorflow_tools/main.py:557: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
I0901 14:32:04.510822 140333568571200 model.py:203] Starting queue runners (val)
I0901 14:32:04.512537 140333568571200 model.py:213] Queue runners started (val)
WARNING:tensorflow:From /opt/digits/digits/tools/tensorflow_tools/model.py:245: The name tf.summary.merge is deprecated. Please use tf.compat.v1.summary.merge instead.
W0901 14:32:04.512653 140333568571200 module_wrapper.py:139] From /opt/digits/digits/tools/tensorflow_tools/model.py:245: The name tf.summary.merge is deprecated. Please use tf.compat.v1.summary.merge instead.
2020-09-01 14:32:04.649822: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.11
2020-09-01 14:32:05.020839: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8
2020-09-01 14:32:06.003824: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at conv_ops.cc:1006 : Not found: No algorithm worked!
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
target_list, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: 2 root error(s) found.
(0) Not found: No algorithm worked!
[[{{node val/model/conv1/Conv2D}}]]
(1) Not found: No algorithm worked!
[[{{node val/model/conv1/Conv2D}}]]
[[val/model/loss/accuracy/_11]]
0 successful operations.
0 derived errors ignored.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/digits/digits/tools/tensorflow_tools/main.py", line 743, in <module>
tf.app.run()
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "/opt/digits/digits/tools/tensorflow_tools/main.py", line 566, in main
Validation(sess, val_model, 0)
File "/opt/digits/digits/tools/tensorflow_tools/main.py", line 376, in Validation
summary_str = sess.run(model.summary)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 956, in run
run_metadata_ptr)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: 2 root error(s) found.
(0) Not found: No algorithm worked!
[[node val/model/conv1/Conv2D (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
(1) Not found: No algorithm worked!
[[node val/model/conv1/Conv2D (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
[[val/model/loss/accuracy/_11]]
0 successful operations.
0 derived errors ignored.
Original stack trace for 'val/model/conv1/Conv2D':
File "/opt/digits/digits/tools/tensorflow_tools/main.py", line 743, in <module>
tf.app.run()
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "/opt/digits/digits/tools/tensorflow_tools/main.py", line 505, in main
val_model.create_model(UserModel, stage_scope)  # noqa
File "/opt/digits/digits/tools/tensorflow_tools/model.py", line 157, in create_model
tower_model.inference  # touch to initialize
File "/opt/digits/digits/tools/tensorflow_tools/utils.py", line 37, in decorator
setattr(self, attribute, function(self))
File "<string>", line 18, in inference
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
return func(*args, **current_args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/contrib/layers/python/layers/layers.py", line 1162, in convolution2d
conv_dims=2)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
return func(*args, **current_args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/contrib/layers/python/layers/layers.py", line 1060, in convolution
outputs = layer.apply(inputs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 330, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/base_layer.py", line 1700, in apply
return self.__call__(inputs, *args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/layers/base.py", line 548, in __call__
outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/base_layer.py", line 854, in __call__
outputs = call_fn(cast_inputs, *args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/impl/api.py", line 234, in wrapper
return converted_call(f, options, args, kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/impl/api.py", line 439, in converted_call
return _call_unconverted(f, args, kwargs, options)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/impl/api.py", line 330, in _call_unconverted
return f(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/layers/convolutional.py", line 201, in call
outputs = self._convolution_op(inputs, self.kernel)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/nn_ops.py", line 1176, in __call__
return self.conv_op(inp, filter)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/nn_ops.py", line 662, in __call__
return self.call(inp, filter)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/nn_ops.py", line 252, in __call__
name=self.name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/nn_ops.py", line 2052, in conv2d
name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gen_nn_ops.py", line 1071, in conv2d
data_format=data_format, dilations=dilations, name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 513, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
attrs, op_def, compute_device)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
op_def=op_def)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
self._traceback = tf_stack.extract_stack()
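
Since most of the output above is deprecation noise, here is a minimal sketch that pulls out the parts that seem to matter: which CUDA libraries were loaded, the root error messages, and the failing graph node. The helper `summarize_log` is my own hypothetical function, not part of DIGITS or TensorFlow, and the regular expressions only assume the line shapes seen in this log. `SAMPLE` holds a few lines copied verbatim from the output above.

```python
import re

# Patterns assume the line shapes that appear in the log above.
LIB_RE = re.compile(r"Successfully opened dynamic library (\S+)")
ERR_RE = re.compile(r"^\s*\(\d+\) (.+)$")
NODE_RE = re.compile(r"\[\[\{\{node (\S+)\}\}\]\]")


def _add_unique(items, value):
    """Append value only if it has not been seen, preserving first-seen order."""
    if value not in items:
        items.append(value)


def summarize_log(text):
    """Return loaded libraries, root errors, and failing nodes, deduplicated."""
    libraries, root_errors, failing_nodes = [], [], []
    for line in text.splitlines():
        match = LIB_RE.search(line)
        if match:
            _add_unique(libraries, match.group(1))
        match = ERR_RE.match(line)
        if match:
            _add_unique(root_errors, match.group(1))
        match = NODE_RE.search(line)
        if match:
            _add_unique(failing_nodes, match.group(1))
    return {
        "libraries": libraries,
        "root_errors": root_errors,
        "failing_nodes": failing_nodes,
    }


# A few lines copied verbatim from the log above.
SAMPLE = """\
2020-09-01 14:32:03.635748: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.11.0
2020-09-01 14:32:03.651248: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8
2020-09-01 14:32:04.306907: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.11.0
(0) Not found: No algorithm worked!
[[{{node val/model/conv1/Conv2D}}]]
"""

if __name__ == "__main__":
    for key, values in summarize_log(SAMPLE).items():
        print(f"{key}: {values}")
```

Run against the full log, this should show that the container loads `libcudart.so.11.0` and `libcudnn.so.8`, and that the only root error is `Not found: No algorithm worked!` at `val/model/conv1/Conv2D`, raised during the first validation pass.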

What am I doing wrong?

Thanks a lot, guys.