[converter] Failed to convert keras model into ONNX format #234

Open
univerone opened this issue Feb 28, 2021 · 0 comments
Labels
bug Something isn't working

Comments

Collaborator

univerone commented Feb 28, 2021

Software and Hardware Versions

  • ModelCI version latest
  • CUDA Version v10.2
  • GPU device used: true

Problem description

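Converting a Keras model to ONNX fails in the optimization step. The conversion itself completes, but the subsequent call to optimizer.optimize in optim_onnx raises IndexError: Input conv1_conv_W_new is undefined! Full log: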

2021-02-28 16:25:40.416520: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-02-28 16:25:40.441655: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-02-28 16:25:40.442280: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce MX110 computeCapability: 5.0
coreClock: 1.006GHz coreCount: 2 deviceMemorySize: 1.96GiB deviceMemoryBandwidth: 37.33GiB/s
2021-02-28 16:25:40.442366: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2021-02-28 16:25:40.443859: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-02-28 16:25:40.445242: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-02-28 16:25:40.445510: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-02-28 16:25:40.447141: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-02-28 16:25:40.447996: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-02-28 16:25:40.448070: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory
2021-02-28 16:25:40.448079: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1598] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2021-02-28 16:25:40.448291: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2021-02-28 16:25:40.453816: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 1800500000 Hz
2021-02-28 16:25:40.454180: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f5674000b60 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-02-28 16:25:40.454196: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-02-28 16:25:40.454257: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-02-28 16:25:40.454265: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      
tf executing eager_mode: True
tf.keras model eager_mode: False
The ONNX operator number change on the optimization: 458 -> 127
2021-02-28 16:25:54,773 - converter - INFO - Begin Simplify ONNX Model ...
2021-02-28 16:25:54,773 - converter - INFO - Begin Simplify ONNX Model ...
2021-02-28 16:25:54,773 - converter - INFO - Begin Simplify ONNX Model ...
2021-02-28 16:25:54,773 - converter - INFO - Begin Simplify ONNX Model ...
Traceback (most recent call last):
  File "~/ML-Model-CI/run_test.py", line 14, in <module>
    onnx_model = ONNXConverter.from_keras(keras_model)
  File "~/ML-Model-CI/modelci/hub/converter/onnx/converter.py", line 66, in wrap
    onnx_model = ONNXConverter.optim_onnx(onnx_model)
  File "~/ML-Model-CI/modelci/hub/converter/onnx/converter.py", line 224, in optim_onnx
    model = optimizer.optimize(model, passes)
  File "~/miniconda3/envs/modelci/lib/python3.8/site-packages/onnx/optimizer.py", line 55, in optimize
    optimized_model_str = C.optimize(model_str, passes)
IndexError: Input conv1_conv_W_new is undefined!

Process finished with exit code 1

Steps to Reproduce the Problem

from modelci.hub.converter.onnx import ONNXConverter
import tensorflow as tf

if __name__ == '__main__':
    keras_model = tf.keras.applications.ResNet50()
    onnx_model = ONNXConverter.from_keras(keras_model)

Expected Behavior

Other Information

Here is a relevant issue: onnx/keras-onnx#337
Maybe we should remove the optimization step that runs after a Keras model is converted to ONNX.
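
A softer variant of that suggestion would be to keep the optimization step but make it best-effort, so a failing pass does not abort the whole conversion. The sketch below is not ModelCI code: optimize_or_fallback and the stand-in optimizers are hypothetical names, and it assumes the optimizer signals failure by raising IndexError, as in the traceback above. It does not import onnx, so it runs on its own.

```python
import logging
from typing import Callable, TypeVar

logger = logging.getLogger("converter")

M = TypeVar("M")


def optimize_or_fallback(model: M, optimize_fn: Callable[[M], M]) -> M:
    """Return optimize_fn(model), or the unmodified model if optimization fails.

    Hypothetical helper: only IndexError is caught, because that is the
    exception type shown in the traceback. Other types would still propagate.
    """
    try:
        return optimize_fn(model)
    except IndexError as exc:
        # Keep the converted model usable and record why optimization was skipped.
        logger.warning("ONNX optimization skipped: %s", exc)
        return model


if __name__ == "__main__":
    # Stand-in for an optimizer that fails the way the traceback shows.
    def failing_optimizer(model: str) -> str:
        raise IndexError("Input conv1_conv_W_new is undefined!")

    # Stand-in for an optimizer that succeeds.
    def working_optimizer(model: str) -> str:
        return model + "_optimized"

    print(optimize_or_fallback("resnet50", failing_optimizer))
    print(optimize_or_fallback("resnet50", working_optimizer))
```

In ModelCI, the callable passed in could wrap the existing optimizer.optimize(model, passes) call in optim_onnx. Whether silently falling back to the unoptimized model is acceptable is a decision for the maintainers.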

@univerone univerone added the bug Something isn't working label Feb 28, 2021