BUS Error, likely with blas #1410

Open
lorenzoriano opened this issue Jan 15, 2019 · 5 comments

@lorenzoriano

Cross post from tensorflow/tensorflow#24844

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS Sierra
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): v1.12.0-rc2-3-ga6d8ffae09 1.12.0
  • Python version: 3.6.5
  • Bazel version (if compiling from source):
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version: no CUDA
  • GPU model and memory: not using a GPU

You can collect some of this information using our environment capture script
You can also obtain the TensorFlow version with
python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
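
For reference, the fields listed above can also be gathered from Python. This is only a minimal sketch and not the official environment capture script: the function name `collect_env_info` is hypothetical, and it assumes Python 3.8 or newer for `importlib.metadata`. It reads package metadata instead of importing TensorFlow, so it also works where the import itself crashes.

```python
import platform
import sys
from importlib import metadata  # standard library, Python 3.8+


def collect_env_info(packages=("tensorflow", "numpy")):
    """Collect OS, Python, and package versions for a bug report (hypothetical helper)."""
    info = {
        "os": platform.platform(),
        "python": platform.python_version(),
        "executable": sys.executable,
    }
    # Look up installed versions without importing the packages themselves.
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_env_info().items():
        print(f"{key}: {value}")
```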

Describe the current behavior
I'm trying out Magenta from Google Brain. When I run

python onsets_frames_transcription_transcribe.py --acoustic_run_dir /Users/lorenzori/Downloads/maestro_checkpoint ~/Downloads/test_audio.wav

the script starts, then dies with a bus error:

/Users/lorenzori/virtualenvs/test-audio/lib/python3.6/site-packages/tensorflow/python/util/tf_inspect.py:75: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() or inspect.getfullargspec()
  return _inspect.getargspec(target)
/Users/lorenzori/virtualenvs/test-audio/lib/python3.6/site-packages/tensorflow/python/util/tf_inspect.py:75: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() or inspect.getfullargspec()
  return _inspect.getargspec(target)
/Users/lorenzori/virtualenvs/test-audio/lib/python3.6/site-packages/tensorflow/python/util/tf_inspect.py:75: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() or inspect.getfullargspec()
  return _inspect.getargspec(target)
2019-01-10 17:49:57.458805: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
INFO:tensorflow:Restoring parameters from /Users/lorenzori/Downloads/maestro_checkpoint/train/model.ckpt-maestro
INFO:tensorflow:Starting transcription for /Users/lorenzori/Downloads/test_audio.wav...
INFO:tensorflow:Processing file...
INFO:tensorflow:Running inference...
[1]    20612 bus error  python onsets_frames_transcription_transcribe.py --acoustic_run_dir

Using lldb I obtain the following info:

Process 19280 stopped
* thread #25, stop reason = EXC_BAD_ACCESS (code=2, address=0x700001382000)
    frame #0: 0x0000000110a24805 libopenblasp-r0.3.0.dev.dylib`dgemm_thread_tn + 1541
libopenblasp-r0.3.0.dev.dylib`dgemm_thread_tn:
->  0x110a24805 <+1541>: xchgq  %rdi, -0x40(%rsi)
    0x110a24809 <+1545>: xorl   %edi, %edi
    0x110a2480b <+1547>: xchgq  %rdi, (%rsi)
    0x110a2480e <+1550>: addq   $0x200, %rsi              ; imm = 0x200
Target 0: (python) stopped.
(lldb) bt
* thread #25, stop reason = EXC_BAD_ACCESS (code=2, address=0x700001382000)
  * frame #0: 0x0000000110a24805 libopenblasp-r0.3.0.dev.dylib`dgemm_thread_tn + 1541
    frame #1: 0x00000001108f3e26 libopenblasp-r0.3.0.dev.dylib`cblas_dgemm + 854
    frame #2: 0x0000000105564285 multiarray.cpython-36m-darwin.so`cblas_matrixproduct + 4917
    frame #3: 0x0000000105529d37 multiarray.cpython-36m-darwin.so`PyArray_MatrixProduct2 + 215
    frame #4: 0x000000010552ed1f multiarray.cpython-36m-darwin.so`array_matrixproduct + 191
    frame #5: 0x00000001000d1cbe Python`_PyCFunction_FastCallDict + 463
    frame #6: 0x00000001001362d6 Python`call_function + 489
    frame #7: 0x000000010012f18b Python`_PyEval_EvalFrameDefault + 4811
    frame #8: 0x0000000100136a38 Python`_PyEval_EvalCodeWithName + 1719
    frame #9: 0x000000010013713b Python`fast_function + 218
    frame #10: 0x00000001001362ad Python`call_function + 448
    frame #11: 0x000000010012f224 Python`_PyEval_EvalFrameDefault + 4964
    frame #12: 0x00000001001373db Python`_PyFunction_FastCall + 121
    frame #13: 0x00000001001362ad Python`call_function + 448
    frame #14: 0x000000010012f18b Python`_PyEval_EvalFrameDefault + 4811
    frame #15: 0x0000000100136a38 Python`_PyEval_EvalCodeWithName + 1719
    frame #16: 0x000000010013730b Python`_PyFunction_FastCallDict + 449
    frame #17: 0x0000000100099f21 Python`_PyObject_FastCallDict + 196
    frame #18: 0x0000000100182073 Python`partial_call + 258
    frame #19: 0x0000000100099da2 Python`PyObject_Call + 101
    frame #20: 0x000000010012f3f4 Python`_PyEval_EvalFrameDefault + 5428
    frame #21: 0x0000000100136a38 Python`_PyEval_EvalCodeWithName + 1719
    frame #22: 0x000000010013730b Python`_PyFunction_FastCallDict + 449
    frame #23: 0x0000000100099f21 Python`_PyObject_FastCallDict + 196
    frame #24: 0x000000010009a044 Python`_PyObject_Call_Prepend + 156
    frame #25: 0x0000000100099da2 Python`PyObject_Call + 101
    frame #26: 0x00000001000e4460 Python`slot_tp_call + 50
    frame #27: 0x0000000100099da2 Python`PyObject_Call + 101
    frame #28: 0x00000001212b078e _pywrap_tensorflow_internal.so`tensorflow::PyFuncOp::Compute(tensorflow::OpKernelContext*) + 974
    frame #29: 0x000000012bd82422 libtensorflow_framework.so`tensorflow::(anonymous namespace)::ExecutorState::Process(tensorflow::(anonymous namespace)::ExecutorState::TaggedNode, long long) + 6690
    frame #30: 0x000000012bd895ba libtensorflow_framework.so`std::__1::__function::__func<std::__1::__bind<void (tensorflow::(anonymous namespace)::ExecutorState::*)(tensorflow::(anonymous namespace)::ExecutorState::TaggedNode, long long), tensorflow::(anonymous namespace)::ExecutorState*, tensorflow::(anonymous namespace)::ExecutorState::TaggedNode const&, long long&>, std::__1::allocator<std::__1::__bind<void (tensorflow::(anonymous namespace)::ExecutorState::*)(tensorflow::(anonymous namespace)::ExecutorState::TaggedNode, long long), tensorflow::(anonymous namespace)::ExecutorState*, tensorflow::(anonymous namespace)::ExecutorState::TaggedNode const&, long long&> >, void ()>::operator()() + 58
    frame #31: 0x000000012bdde824 libtensorflow_framework.so`Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) + 1876
    frame #32: 0x000000012bdddfd4 libtensorflow_framework.so`std::__1::__function::__func<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'(), std::__1::allocator<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'()>, void ()>::operator()() + 52
    frame #33: 0x000000012be00070 libtensorflow_framework.so`void* std::__1::__thread_proxy<std::__1::tuple<std::__1::function<void ()> > >(void*) + 96
    frame #34: 0x00007fffb96b893b libsystem_pthread.dylib`_pthread_body + 180
    frame #35: 0x00007fffb96b8887 libsystem_pthread.dylib`_pthread_start + 286
    frame #36: 0x00007fffb96b808d libsystem_pthread.dylib`thread_start + 13
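
To make the backtrace above easier to read, here is a small sketch that groups the frames by library, from the faulting frame outward. The helper names (`modules_in_order`, `parse_stop_reason`) are hypothetical, and only a handful of frames are copied from the trace. The note that `code=2` corresponds to `KERN_PROTECTION_FAILURE` comes from the Mach kernel return codes, not from anything stated in this report.

```python
import re

# A subset of the frames from the lldb output above, copied verbatim.
BACKTRACE = """\
* thread #25, stop reason = EXC_BAD_ACCESS (code=2, address=0x700001382000)
  * frame #0: 0x0000000110a24805 libopenblasp-r0.3.0.dev.dylib`dgemm_thread_tn + 1541
    frame #1: 0x00000001108f3e26 libopenblasp-r0.3.0.dev.dylib`cblas_dgemm + 854
    frame #2: 0x0000000105564285 multiarray.cpython-36m-darwin.so`cblas_matrixproduct + 4917
    frame #5: 0x00000001000d1cbe Python`_PyCFunction_FastCallDict + 463
    frame #28: 0x00000001212b078e _pywrap_tensorflow_internal.so`tensorflow::PyFuncOp::Compute(tensorflow::OpKernelContext*) + 974
    frame #31: 0x000000012bdde824 libtensorflow_framework.so`Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) + 1876
    frame #36: 0x00007fffb96b808d libsystem_pthread.dylib`thread_start + 13
"""

FRAME_RE = re.compile(r"frame #(\d+): 0x[0-9a-f]+ ([^`]+)`(.+?) \+ \d+$")
STOP_RE = re.compile(r"stop reason = (\w+) \(code=(\d+), address=(0x[0-9a-f]+)\)")


def parse_stop_reason(backtrace):
    """Return (exception name, code, faulting address) from the lldb stop line."""
    match = STOP_RE.search(backtrace)
    return match.group(1), int(match.group(2)), int(match.group(3), 16)


def modules_in_order(backtrace):
    """Return (module, first symbol seen) pairs, innermost frame first."""
    seen = {}
    for line in backtrace.splitlines():
        match = FRAME_RE.search(line)
        if match:
            seen.setdefault(match.group(2), match.group(3))
    return list(seen.items())


if __name__ == "__main__":
    kind, code, address = parse_stop_reason(BACKTRACE)
    # code=2 is KERN_PROTECTION_FAILURE: the page is mapped but not accessible.
    print(kind, code, hex(address), "page-aligned:", address % 4096 == 0)
    for module, symbol in modules_in_order(BACKTRACE):
        print(f"{module}: {symbol}")
```

Read this way, the fault is inside OpenBLAS's threaded dgemm, reached from NumPy's matrix product, which is called by Python code running under TensorFlow's PyFuncOp on a TensorFlow worker thread.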

Describe the expected behavior
The script should run to completion without crashing.

Code to reproduce the issue
python onsets_frames_transcription_transcribe.py --acoustic_run_dir <checkpoint_dir> <wav_file>
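
Since the crash sits in OpenBLAS's threaded dgemm path (`dgemm_thread_tn`), one diagnostic worth trying is to rerun the same command with BLAS threading capped at one thread. The sketch below is an untested assumption, not a confirmed fix: nobody in this thread has verified that it avoids the bus error. The helper names are hypothetical; the environment variables are the standard thread-count controls for OpenBLAS, MKL, and OpenMP.

```python
import os
import subprocess

# Thread-count controls read by OpenBLAS, MKL, and OpenMP runtimes at startup.
THREAD_VARS = ("OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS", "OMP_NUM_THREADS")


def single_threaded_env(base_env=None):
    """Return a copy of the environment with BLAS threading capped at 1."""
    env = dict(os.environ if base_env is None else base_env)
    for name in THREAD_VARS:
        env[name] = "1"
    return env


def run_single_threaded(argv):
    """Run a command with the capped environment and return its exit code."""
    return subprocess.run(argv, env=single_threaded_env()).returncode
```

For example, `run_single_threaded(["python", "onsets_frames_transcription_transcribe.py", "--acoustic_run_dir", checkpoint_dir, wav_file])`, where `checkpoint_dir` and `wav_file` stand in for the placeholders above. If the crash disappears, that points at the threaded BLAS code path; if it persists, the cause lies elsewhere.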

@cghawthorne
Contributor

I agree this sounds more like a general TF issue. Have you tried upgrading to the final 1.12.0 release?

@lorenzoriano
Author

I used the 1.12.0 version that Magenta pulls in when installed via pip. The same problem doesn't happen on Ubuntu 16.04, so it must be some odd TF + Mac + Magenta combination. Other TF programs on the same Mac work fine.

@pangwong

@lorenzoriano Did you ever find a solution? I'm hitting the same error.

@lorenzoriano
Author

No, I never found a solution.

@pangwong

pangwong commented Feb 15, 2019

In my case, mkl turned out to be the cause of this error. Replacing mkl 2019 with mkl 2018, as listed below, fixed it.

# Name                    Version                   Build  Channel
mkl                       2018.0.3                 pypi_0    pypi
mkl-fft                   1.0.2                    pypi_0    pypi
mkl-random                1.0.1.1                  pypi_0    pypi

My environment is:

macOS Mojave
tensorflow 1.12.0
magenta 1.0.2
python 3.6.8
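
A quick way to check whether an environment has the mkl release described above is sketched here. The helper names are hypothetical, the "2019 is affected" rule rests only on the single report in this comment, and the lookup assumes Python 3.8+ and an mkl installed through pip (a conda-only install may not be visible this way).

```python
from importlib import metadata  # standard library, Python 3.8+


def mkl_major_version(version_string):
    """Return the leading release year of an mkl version string."""
    return int(version_string.split(".")[0])


def matches_reported_bad_release(version_string):
    # Assumption from one report in this thread: 2019 crashed, 2018.0.3 worked.
    return mkl_major_version(version_string) == 2019


def installed_mkl_version():
    """Return the pip-installed mkl version, or None if it is not found."""
    try:
        return metadata.version("mkl")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_mkl_version()
    if version is None:
        print("mkl not found via pip metadata")
    elif matches_reported_bad_release(version):
        print(f"mkl {version}: matches the release reported to crash here")
    else:
        print(f"mkl {version}: does not match the release reported to crash")
```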
