@davidson1, @jvishnuvardhan! I tried to replicate this in a Colab environment, but the issue did not reproduce there. Providing gists for TF 2.5, 2.6 and 2.7 for reference.
1. System information
Operating System: Ubuntu 18.04.5 LTS
Kernel: Linux 5.4.0-60-generic
Architecture: x86-64
GPU: 2x Nvidia Quadro RTX8000
cuda: v11.0
Tensorflow v2.4.0 (installed through pip)
Also tested on TF v2.5.0
2. Code
from tensorflow.keras import layers
from tensorflow.keras.losses import SparseCategoricalCrossentropy
from tensorflow import keras
import tensorflow as tf
import os
import numpy as np
# split data between train and test
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
def representative_dataset():
    for data in tf.data.Dataset.from_tensor_slices((x_train)).batch(1).take(100):
        yield [data]
# model definition
input = keras.Input(shape=(28, 28), dtype=tf.uint8)
x = tf.cast(input, dtype=tf.float32)
x = tf.expand_dims(x, -1)
x = layers.Conv2D(32, 3, activation='relu', padding="valid")(x)
x = layers.Conv2D(32, 5, activation='relu', padding="valid")(x)
x = layers.Flatten()(x)
x = layers.Dense(10)(x)
x = layers.Softmax()(x)
model = keras.models.Model(input, x)
model.compile(loss=SparseCategoricalCrossentropy())
model.fit(x_train, y_train, epochs=1)
# convert model to TFLite
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
tflite_quant_model = converter.convert()
if not os.path.exists("TFLite_models"):
    os.mkdir("TFLite_models")
with open('TFLite_models/model.tflite', "wb") as f:
    f.write(tflite_quant_model)
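# Separate file for debugging purposes (tflite-test.py)
import numpy as np
import tensorflow as tf
from tensorflow import keras
# reload the MNIST test split; x_test.shape is (10000, 28, 28)
(_, _), (x_test, y_test) = keras.datasets.mnist.load_data()
# TFLite inference
interpreter = tf.lite.Interpreter("../shell/TFLite_models/model.tflite")
# the whole test set is fed to the interpreter as a single batch
interpreter.resize_tensor_input(0, x_test.shape)
interpreter.allocate_tensors()
interpreter.set_tensor(0, x_test)
interpreter.invoke()
output_details = interpreter.get_output_details()
prediction = interpreter.get_tensor(output_details[0]['index'])
print("Test accuracy: ", np.count_nonzero(y_test == prediction.argmax(axis=-1))/len(y_test))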
3. Failure after conversion
The model converts successfully and can be inspected with, e.g., Netron. However, running inference on the converted model causes a segmentation fault.
The segmentation fault does not occur when the model is not quantized, but skipping quantization is not an option for me.
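One thing that may be worth trying, assuming the crash depends on how many samples are passed in a single `invoke()` call (the report does not confirm this), is to feed the test set to the quantized model in smaller chunks. The sketch below uses only the `tf.lite.Interpreter` methods already used in this report. The helper name `predict_in_batches`, its `sample_shape` parameter, and the default batch size of 256 are hypothetical choices for illustration.

```python
def predict_in_batches(interpreter, samples, sample_shape, batch_size=256):
    """Run a TFLite interpreter over `samples` in chunks of `batch_size`.

    `interpreter` is expected to behave like tf.lite.Interpreter.
    `sample_shape` is the shape of one sample, e.g. (28, 28) for MNIST.
    Hypothetical helper, not part of the original report.
    """
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]
    predictions = []
    for start in range(0, len(samples), batch_size):
        chunk = samples[start:start + batch_size]
        # Resize the input to the size of this chunk (the last one may be smaller).
        interpreter.resize_tensor_input(input_index, [len(chunk), *sample_shape])
        interpreter.allocate_tensors()
        interpreter.set_tensor(input_index, chunk)
        interpreter.invoke()
        # Collect one output row per sample in the chunk.
        predictions.extend(interpreter.get_tensor(output_index))
    return predictions
```

With the variables from tflite-test.py, this could be called as `prediction = np.asarray(predict_in_batches(interpreter, x_test, x_test.shape[1:]))`, after which the accuracy line works unchanged. If the crash disappears with small batches, that would point at the batch size rather than at the quantized weights themselves.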
4. (optional) Any other info / logs
The issue persists when changing the first kernel_size to 1, or the second kernel_size to >5.
The issue vanishes when using kernel_size 3 for all layers.
The issue comes back when adding padding="same" to both layers with kernel_size=3.
The issue vanishes when just using one layer, or if e.g. MaxPool2D is used between the Conv2D layers.
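A rough size estimate may help interpret the observations above. The sketch below rests on two assumptions that are not confirmed anywhere in this report: that the im2col scratch buffer of a stride-1 convolution holds `batch * out_h * out_w * kernel_h * kernel_w * in_channels` elements, and that trouble starts once that count no longer fits in a signed 32-bit integer. The batch is 10,000 because tflite-test.py resizes the input to `x_test.shape`. The `MaxPool2D` case assumes the default pool size of 2. All function and variable names are illustrative.

```python
INT32_MAX = 2**31 - 1
BATCH = 10_000  # x_test.shape[0] for MNIST


def conv_output_size(input_size, kernel_size, padding):
    """Spatial output size of a stride-1 convolution."""
    if padding == "same":
        return input_size
    return input_size - kernel_size + 1  # "valid"


def im2col_elements(batch, input_size, in_channels, kernel_size, padding):
    """Assumed element count of the im2col buffer (see lead-in)."""
    out = conv_output_size(input_size, kernel_size, padding)
    return batch * out * out * kernel_size * kernel_size * in_channels


# (input height/width, input channels, kernel size, padding) of the
# convolution that needs the largest buffer in each reported variant.
cases = {
    "reported (3 valid, then 5 valid)": (26, 32, 5, "valid"),
    "first kernel_size=1, second 5": (28, 32, 5, "valid"),
    "kernel_size=3 everywhere, valid": (26, 32, 3, "valid"),
    "kernel_size=3 everywhere, same": (28, 32, 3, "same"),
    "MaxPool2D between the layers": (13, 32, 5, "valid"),
    "single Conv2D layer": (28, 1, 3, "valid"),
}

for name, (size, channels, kernel, padding) in cases.items():
    n = im2col_elements(BATCH, size, channels, kernel, padding)
    print(f"{name:34s} {n:>13,d}  exceeds int32: {n > INT32_MAX}")
```

Under these assumptions, the three variants reported as crashing come out above 2,147,483,647 elements (3,872,000,000, 4,608,000,000 and 2,257,920,000), while the three reported as working stay below it (1,658,880,000, 648,000,000 and 60,840,000). This is only consistent with an overflow in the `Im2col` frame seen in the backtrace; it does not prove that this is the cause.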
When calling "gdb --args python tflite-test.py", the output is:
Starting program: tflite-test.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
2021-09-29 09:40:17.586419: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Spawning and exiting threads
[New Thread 0x7fffcc3e8700 (LWP 4629)]
[New Thread 0x7fffcbbe7700 (LWP 4630)]
[New Thread 0x7fffc33e6700 (LWP 4631)]
[New Thread 0x7fffbabe5700 (LWP 4632)]
[New Thread 0x7fffb23e4700 (LWP 4633)]
[New Thread 0x7fffa9be3700 (LWP 4634)]
[New Thread 0x7fffa13e2700 (LWP 4635)]
[New Thread 0x7fff98be1700 (LWP 4636)]
[New Thread 0x7fff883e0700 (LWP 4637)]
[New Thread 0x7fff7fbdf700 (LWP 4638)]
[New Thread 0x7fff7f3de700 (LWP 4639)]
[New Thread 0x7fff6ebdd700 (LWP 4640)]
[New Thread 0x7fff6e3dc700 (LWP 4641)]
[New Thread 0x7fff65bdb700 (LWP 4642)]
[New Thread 0x7fff5d3da700 (LWP 4643)]
[New Thread 0x7fff54bd9700 (LWP 4644)]
[New Thread 0x7fff4c3d8700 (LWP 4645)]
[New Thread 0x7fff3bbd7700 (LWP 4646)]
[New Thread 0x7fff333d6700 (LWP 4647)]
[Thread 0x7fff5d3da700 (LWP 4643) exited]
[Thread 0x7fff333d6700 (LWP 4647) exited]
[Thread 0x7fff3bbd7700 (LWP 4646) exited]
[Thread 0x7fff4c3d8700 (LWP 4645) exited]
[Thread 0x7fff54bd9700 (LWP 4644) exited]
[Thread 0x7fff65bdb700 (LWP 4642) exited]
[Thread 0x7fff6e3dc700 (LWP 4641) exited]
[Thread 0x7fff6ebdd700 (LWP 4640) exited]
[Thread 0x7fff7f3de700 (LWP 4639) exited]
[Thread 0x7fff7fbdf700 (LWP 4638) exited]
[Thread 0x7fff883e0700 (LWP 4637) exited]
[Thread 0x7fff98be1700 (LWP 4636) exited]
[Thread 0x7fffa13e2700 (LWP 4635) exited]
[Thread 0x7fffa9be3700 (LWP 4634) exited]
[Thread 0x7fffb23e4700 (LWP 4633) exited]
[Thread 0x7fffbabe5700 (LWP 4632) exited]
[Thread 0x7fffc33e6700 (LWP 4631) exited]
[Thread 0x7fffcbbe7700 (LWP 4630) exited]
[Thread 0x7fffcc3e8700 (LWP 4629) exited]
[New Thread 0x7fff333d6700 (LWP 4675)]
[New Thread 0x7fff3bbd7700 (LWP 4676)]
[New Thread 0x7fff4c3d8700 (LWP 4677)]
[New Thread 0x7fff54bd9700 (LWP 4678)]
[New Thread 0x7fff1359e700 (LWP 4679)]
[New Thread 0x7fff10d9d700 (LWP 4681)]
[New Thread 0x7fff0e59c700 (LWP 4682)]
[New Thread 0x7fff0bd9b700 (LWP 4683)]
[New Thread 0x7fff0959a700 (LWP 4684)]
[New Thread 0x7fff04d99700 (LWP 4685)]
[New Thread 0x7fff02598700 (LWP 4686)]
[New Thread 0x7ffeffd97700 (LWP 4687)]
[New Thread 0x7ffefd596700 (LWP 4688)]
[New Thread 0x7ffefad95700 (LWP 4689)]
[New Thread 0x7ffef8594700 (LWP 4690)]
[New Thread 0x7ffef5d93700 (LWP 4691)]
[New Thread 0x7ffef3592700 (LWP 4692)]
[New Thread 0x7ffef0d91700 (LWP 4693)]
[New Thread 0x7ffeee590700 (LWP 4694)]
Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007ffff7a9d476 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
When calling "bt", the output is
Backtrace of SIGSEGV
#0 0x00007ffff7a9d476 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fff1d290452 in void tflite::optimized_ops::Im2col(tflite::ConvParams const&, int, int, unsigned char, tflite::RuntimeShape const&, signed char const*, tflite::RuntimeShape const&, signed char*) ()
    from /local/home/david/venvs/venv_shanas_py38/lib/python3.8/site-packages/tensorflow/lite/python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so
#2 0x00007fff1d2c40c0 in tflite::optimized_integer_ops::ConvPerChannel(tflite::ConvParams const&, int const*, int const*, tflite::RuntimeShape const&, signed char const*, tflite::RuntimeShape const&, signed char const*, tflite::RuntimeShape const&, int const*, tflite::RuntimeShape const&, signed char*, tflite::RuntimeShape const&, signed char*, tflite::CpuBackendContext*) ()
    from /local/home/david/venvs/venv_shanas_py38/lib/python3.8/site-packages/tensorflow/lite/python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so
#3 0x00007fff1d2c43d2 in void tflite::ops::builtin::conv::EvalQuantizedPerChannel<(tflite::ops::builtin::conv::KernelType)2>(TfLiteContext*, TfLiteNode*, TfLiteConvParams*, tflite::ops::builtin::conv::OpData*, TfLiteTensor const*, TfLiteTensor const*, TfLiteTensor const*, TfLiteTensor*, TfLiteTensor*) ()
    from /local/home/david/venvs/venv_shanas_py38/lib/python3.8/site-packages/tensorflow/lite/python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so
#4 0x00007fff1d2c460f in TfLiteStatus tflite::ops::builtin::conv::EvalImpl<(tflite::ops::builtin::conv::KernelType)2, (TfLiteType)9>(TfLiteContext*, TfLiteNode*) ()
    from /local/home/david/venvs/venv_shanas_py38/lib/python3.8/site-packages/tensorflow/lite/python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so
#5 0x00007fff1d2d2992 in TfLiteStatus tflite::ops::builtin::conv::Eval<(tflite::ops::builtin::conv::KernelType)2>(TfLiteContext*, TfLiteNode*) ()
    from /local/home/david/venvs/venv_shanas_py38/lib/python3.8/site-packages/tensorflow/lite/python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so
#6 0x00007fff1d4c5403 in tflite::Subgraph::Invoke() ()
    from /local/home/david/venvs/venv_shanas_py38/lib/python3.8/site-packages/tensorflow/lite/python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so
#7 0x00007fff1d4c7eb0 in tflite::Interpreter::Invoke() ()
    from /local/home/david/venvs/venv_shanas_py38/lib/python3.8/site-packages/tensorflow/lite/python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so
#8 0x00007fff1d212bb8 in tflite::interpreter_wrapper::InterpreterWrapper::Invoke() ()
    from /local/home/david/venvs/venv_shanas_py38/lib/python3.8/site-packages/tensorflow/lite/python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so
#9 0x00007fff1d209661 in void pybind11::cpp_function::initialize<pybind11_init__pywrap_tensorflow_interpreter_wrapper(pybind11::module&)::{lambda(tflite::interpreter_wrapper::InterpreterWrapper&)#6}, pybind11::object, tflite::interpreter_wrapper::InterpreterWrapper&, pybind11::name, pybind11::is_method, pybind11::sibling>(pybind11_init__pywrap_tensorflow_interpreter_wrapper(pybind11::module&)::{lambda(tflite::interpreter_wrapper::InterpreterWrapper&)#6}&&, pybind11::object (*)(tflite::interpreter_wrapper::InterpreterWrapper&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call) ()
    from /local/home/david/venvs/venv_shanas_py38/lib/python3.8/site-packages/tensorflow/lite/python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so
#10 0x00007fff1d2066f2 in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) ()
    from /local/home/david/venvs/venv_shanas_py38/lib/python3.8/site-packages/tensorflow/lite/python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so
#11 0x00000000005ff286 in PyCFunction_Call ()
#12 0x00000000005ff94f in _PyObject_MakeTpCall ()
#13 0x00000000005002df in ?? ()
#14 0x000000000057d54b in _PyEval_EvalFrameDefault ()
#15 0x000000000060251c in _PyFunction_Vectorcall ()
#16 0x0000000000578a0e in _PyEval_EvalFrameDefault ()
#17 0x00000000005760ed in _PyEval_EvalCodeWithName ()
#18 0x000000000066299e in ?? ()
#19 0x0000000000662a77 in PyRun_FileExFlags ()
#20 0x000000000066378f in PyRun_SimpleFileExFlags ()
#21 0x0000000000687dce in Py_RunMain ()
#22 0x0000000000688159 in Py_BytesMain ()
#23 0x00007ffff7a03bf7 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#24 0x00000000006073fa in _start ()