Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation fault (core dumped) in tf.raw_ops.FusedResizeAndPadConv2D #67529

Open
LongZE666 opened this issue May 14, 2024 · 1 comment
Open
Assignees
Labels
comp:ops OPs related issues TF 2.16 type:bug Bug

Comments

@LongZE666
Copy link

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

Yes

Source

source

TensorFlow version

tf 2.16.1

Custom code

Yes

OS platform and distribution

Ubuntu 20.04

Mobile device

No response

Python version

No response

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

No response

Current behavior?

Illegal size will trigger a segfault.

Standalone code to reproduce the issue

# Signal --4;2024-05-13 05:48:16.272992: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.2024-05-13 05:48:16.273352: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.2024-05-13 05:48:16.277715: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.2024-05-13 05:48:16.331241: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.2024-05-13 05:48:17.210828: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT2024-05-13 05:48:17.792175: E tensorflow/core/util/util.cc:131] oneDNN supports DT_HALF only on platforms with AVX-512. Falling back to the default Eigen-based implementation if present.

# FusedResizeConv2DUsingGemmOp

import tensorflow as tf

mode = "REFLECT"
strides = [1, 1, 1, 1]
padding = "VALID"
resize_align_corners = False
input = tf.constant(0, shape=[1,2,3,2], dtype=tf.float16)
size = tf.constant([65534,65534], shape=[2], dtype=tf.int32)
paddings = tf.constant(0, shape=[4,2], dtype=tf.int32)
filter = tf.constant(0, shape=[1,2,2,2], dtype=tf.float16)
tf.raw_ops.FusedResizeAndPadConv2D(input=input, size=size, paddings=paddings, filter=filter, mode=mode, strides=strides, padding=padding, resize_align_corners=resize_align_corners)

Relevant log output

2024-05-14 08:31:20.675971: E tensorflow/core/util/util.cc:131] oneDNN supports DT_HALF only on platforms with AVX-512. Falling back to the default Eigen-based implementation if present.
Segmentation fault (core dumped)

ASAN report:

AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
=================================================================
AddressSanitizerAddressSanitizerAddressSanitizer:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
AddressSanitizerAddressSanitizer:DEADLYSIGNAL
:DEADLYSIGNAL
AddressSanitizerAddressSanitizer:DEADLYSIGNAL
:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizer:DEADLYSIGNAL
AddressSanitizerAddressSanitizer:DEADLYSIGNAL
AddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizer:DEADLYSIGNAL
AddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizer:DEADLYSIGNAL
AddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizer:DEADLYSIGNAL
AddressSanitizerAddressSanitizerAddressSanitizer:DEADLYSIGNAL
:DEADLYSIGNAL
AddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizerAddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
AddressSanitizer==2060653==ERROR: AddressSanitizer: SEGV on unknown address 0x7fbb6ce8a800 (pc 0x7fc21cc305a9 bp 0x7fc1b01c05c0 sp 0x7fc1b01c03b0 T93)
:DEADLYSIGNAL
:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
==2060653==The signal is caused by a WRITE memory access.
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizerAddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
:DEADLYSIGNAL
:DEADLYSIGNAL
    #0 0x7fc21cc305a9 in Eigen::TensorEvaluator<Eigen::TensorContractionOp<Eigen::array<Eigen::IndexPair<long>, 1ul> const, Eigen::TensorMap<Eigen::Tensor<Eigen::half const, 2, 1, long>, 16, Eigen::MakePointer> const, Eigen::TensorMap<Eigen::Tensor<Eigen::half const, 2, 1, long>, 16, Eigen::MakePointer> const, Eigen::NoOpOutputKernel const> const, Eigen::ThreadPoolDevice>::EvalParallelContext<Eigen::TensorEvaluator<Eigen::TensorContractionOp<Eigen::array<Eigen::IndexPair<long>, 1ul> const, Eigen::TensorMap<Eigen::Tensor<Eigen::half const, 2, 1, long>, 16, Eigen::MakePointer> const, Eigen::TensorMap<Eigen::Tensor<Eigen::half const, 2, 1, long>, 16, Eigen::MakePointer> const, Eigen::NoOpOutputKernel const> const, Eigen::ThreadPoolDevice>::NoCallback, true, true, false, 0>::pack_rhs(long, long) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2+0x394ae5a9)
    #1 0x7fc21cc31692 in Eigen::TensorEvaluator<Eigen::TensorContractionOp<Eigen::array<Eigen::IndexPair<long>, 1ul> const, Eigen::TensorMap<Eigen::Tensor<Eigen::half const, 2, 1, long>, 16, Eigen::MakePointer> const, Eigen::TensorMap<Eigen::Tensor<Eigen::half const, 2, 1, long>, 16, Eigen::MakePointer> const, Eigen::NoOpOutputKernel const> const, Eigen::ThreadPoolDevice>::EvalParallelContext<Eigen::TensorEvaluator<Eigen::TensorContractionOp<Eigen::array<Eigen::IndexPair<long>, 1ul> const, Eigen::TensorMap<Eigen::Tensor<Eigen::half const, 2, 1, long>, 16, Eigen::MakePointer> const, Eigen::TensorMap<Eigen::Tensor<Eigen::half const, 2, 1, long>, 16, Eigen::MakePointer> const, Eigen::NoOpOutputKernel const> const, Eigen::ThreadPoolDevice>::NoCallback, true, true, false, 0>::enqueue_packing_helper(long, long, long, bool) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2+0x394af692)
    #2 0x7fc24fd0b121 in Eigen::ThreadPoolTempl<tsl::thread::EigenEnvironment>::WorkerLoop(int) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_framework.so.2+0x41fd121)
    #3 0x7fc24fd00326 in void absl::lts_20230802::internal_any_invocable::RemoteInvoker<false, void, tsl::thread::EigenEnvironment::CreateThread(std::function<void ()>)::{lambda()#1}&>(absl::lts_20230802::internal_any_invocable::TypeErasedState*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_framework.so.2+0x41f2326)
    #4 0x7fc24e62625c in tsl::(anonymous namespace)::PThread::ThreadFn(void*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_framework.so.2+0x2b1825c)
    #5 0x7fc2f9885ac2  (/lib/x86_64-linux-gnu/libc.so.6+0x94ac2)
    #6 0x7fc2f9916a03 in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x125a03)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2+0x394ae5a9) in Eigen::TensorEvaluator<Eigen::TensorContractionOp<Eigen::array<Eigen::IndexPair<long>, 1ul> const, Eigen::TensorMap<Eigen::Tensor<Eigen::half const, 2, 1, long>, 16, Eigen::MakePointer> const, Eigen::TensorMap<Eigen::Tensor<Eigen::half const, 2, 1, long>, 16, Eigen::MakePointer> const, Eigen::NoOpOutputKernel const> const, Eigen::ThreadPoolDevice>::EvalParallelContext<Eigen::TensorEvaluator<Eigen::TensorContractionOp<Eigen::array<Eigen::IndexPair<long>, 1ul> const, Eigen::TensorMap<Eigen::Tensor<Eigen::half const, 2, 1, long>, 16, Eigen::MakePointer> const, Eigen::TensorMap<Eigen::Tensor<Eigen::half const, 2, 1, long>, 16, Eigen::MakePointer> const, Eigen::NoOpOutputKernel const> const, Eigen::ThreadPoolDevice>::NoCallback, true, true, false, 0>::pack_rhs(long, long)
Thread T93 created by T0 here:
    #0 0x7fc2f9bab685 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:216
    #1 0x7fc24e633fbf in tsl::(anonymous namespace)::PThread::PThread(tsl::ThreadOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, absl::lts_20230802::AnyInvocable<void ()>) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_framework.so.2+0x2b25fbf)
    #2 0x7fc24e63450a in tsl::(anonymous namespace)::PosixEnv::StartThread(tsl::ThreadOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, absl::lts_20230802::AnyInvocable<void ()>) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_framework.so.2+0x2b2650a)
    #3 0x7fc24fd093a8 in Eigen::ThreadPoolTempl<tsl::thread::EigenEnvironment>::ThreadPoolTempl(int, bool, tsl::thread::EigenEnvironment) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_framework.so.2+0x41fb3a8)
    #4 0x7fc24fd0d498 in tsl::thread::ThreadPool::ThreadPool(tsl::Env*, tsl::ThreadOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int, bool, Eigen::Allocator*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_framework.so.2+0x41ff498)
    #5 0x7fc24d7ef278 in tensorflow::LocalDevice::EigenThreadPoolInfo::EigenThreadPoolInfo(tensorflow::SessionOptions const&, int, tsl::Allocator*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_framework.so.2+0x1ce1278)
    #6 0x7fc24d7f05c5 in tensorflow::LocalDevice::LocalDevice(tensorflow::SessionOptions const&, tensorflow::DeviceAttributes const&) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_framework.so.2+0x1ce25c5)
    #7 0x7fc24d7e3111 in tensorflow::ThreadPoolDevice::ThreadPoolDevice(tensorflow::SessionOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, tsl::gtl::IntType<tensorflow::Bytes_tag_, long>, tensorflow::DeviceLocality const&, tsl::Allocator*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_framework.so.2+0x1cd5111)
    #8 0x7fc24d7dc1dd in tensorflow::ThreadPoolDeviceFactory::CreateDevices(tensorflow::SessionOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::unique_ptr<tensorflow::Device, std::default_delete<tensorflow::Device> >, std::allocator<std::unique_ptr<tensorflow::Device, std::default_delete<tensorflow::Device> > > >*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_framework.so.2+0x1cce1dd)
    #9 0x7fc24d9a667c in tensorflow::DeviceFactory::AddCpuDevices(tensorflow::SessionOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::unique_ptr<tensorflow::Device, std::default_delete<tensorflow::Device> >, std::allocator<std::unique_ptr<tensorflow::Device, std::default_delete<tensorflow::Device> > > >*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_framework.so.2+0x1e9867c)
    #10 0x7fc24d9a6bbd in tensorflow::DeviceFactory::AddDevices(tensorflow::SessionOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::unique_ptr<tensorflow::Device, std::default_delete<tensorflow::Device> >, std::allocator<std::unique_ptr<tensorflow::Device, std::default_delete<tensorflow::Device> > > >*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_framework.so.2+0x1e98bbd)
    #11 0x7fc1f3dd6db8 in TFE_NewContext (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2+0x10654db8)
    #12 0x7fc1df839e70 in pybind11::cpp_function::initialize<pybind11_init__pywrap_tfe(pybind11::module_&)::{lambda(TFE_ContextOptions const*)#9}, pybind11::object, TFE_ContextOptions const*, pybind11::name, pybind11::scope, pybind11::sibling, pybind11::return_value_policy>(pybind11_init__pywrap_tfe(pybind11::module_&)::{lambda(TFE_ContextOptions const*)#9}&&, pybind11::object (*)(TFE_ContextOptions const*), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, pybind11::return_value_policy const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/_pywrap_tfe.so+0x1b6e70)
    #13 0x7fc1df851899 in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/_pywrap_tfe.so+0x1ce899)
    #14 0x51ad66  (/usr/bin/python3.11+0x51ad66)

==2060653==ABORTING
@sushreebarsa
Copy link
Contributor

@SuryanarayanaY I was able to replicate the issue reported here. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:ops OPs related issues TF 2.16 type:bug Bug
Projects
None yet
Development

No branches or pull requests

4 participants