Tensorflow 1.11.0 Error #25

Open
DEKHTIARJonathan opened this issue Oct 9, 2018 · 9 comments

@DEKHTIARJonathan

You have an error with TF 1.11.0. It's impossible to build the library.

image

I managed to build it with TF 1.8.0 / 1.9.0 / 1.10.0, but with TF 1.11.0 the build is broken.

@clhne

clhne commented Oct 11, 2018

So, how can this be fixed in TF 1.11?

@DEKHTIARJonathan
Author

DEKHTIARJonathan commented Oct 11, 2018

Unfortunately, I don't have the time to investigate. Just wanted to mention it.


@andrei-pokrovsky
Contributor

andrei-pokrovsky commented Oct 13, 2018

We have limited resources for supporting and testing various combinations of versions, packages, Linux distributions, etc. We currently do not support Python 3.6. It looks like the error happens with Python 3.6?

We also currently support tensorflow 1.8 for this project and haven't upgraded to 1.11 yet.

@DEKHTIARJonathan
Author

Thanks @andrei-pokrovsky for the feedback. As I said above, I created this issue just to let you know that the library does not build with TF 1.11.0.

I believe it has nothing to do with Python 3.5/3.6; nonetheless, it is worth trying ;)

@DEKHTIARJonathan
Author

@andrei-pokrovsky I can give you a hint: it seems that there is an issue in TF 1.11.0 itself.

tensorflow/tensorflow#22766

@mixxen

mixxen commented May 5, 2019

It compiled on TF 1.12, but a couple of tests failed. To compile, edit the file /usr/local/lib/python3.5/dist-packages/tensorflow/include/absl/strings/string_view.h and remove ABSL_ASSERT from line 496.

Output from make test:

cd ../benchmark && bash run_all_unittests.bash # unit tests
test_basic (reduce_mask_tests.ReduceMaskTests) ... ok
test_larger (reduce_mask_tests.ReduceMaskTests) ... 2019-05-05 20:24:07.045607: E tensorflow/core/common_runtime/executor.cc:623] Executor failed to create kernel. Invalid argument: Default MaxPoolingOp only supports NHWC on device type CPU
	 [[{{node MaxPool}} = MaxPool[T=DT_FLOAT, data_format="NCHW", ksize=[1, 1, 5, 5], padding="VALID", strides=[1, 1, 4, 4], _device="/job:localhost/replica:0/task:0/device:GPU:0"](MaxPool-0-TransposeNHWCToNCHW-LayoutOptimizer)]]
ok
test_session (reduce_mask_tests.ReduceMaskTests)
Use cached_session instead. ... ok
test_session (sparse_conv_tests.SparseConv2DCustomTests)
Use cached_session instead. ... ok
test_sparse_conv2d_with_mask_same (sparse_conv_tests.SparseConv2DCustomTests) ... ok
test_session (sparse_conv_tests.SparseConv2DTests)
Use cached_session instead. ... ok
test_sparse_conv2d_correctness (sparse_conv_tests.SparseConv2DTests) ... ok
test_sparse_conv2d_matmul_correctness (sparse_conv_tests.SparseConv2DTests) ... ok
test_sparse_conv2d_same (sparse_conv_tests.SparseConv2DTests) ... ok
test_sparse_conv2d_valid (sparse_conv_tests.SparseConv2DTests) ... ok
test_sparse_conv2d_with_large_block_strides (sparse_conv_tests.SparseConv2DTests) ... ok
test_sparse_conv2d_with_mask_same (sparse_conv_tests.SparseConv2DTests) ... ok
test_sparse_conv2d_with_mask_same_even_block (sparse_conv_tests.SparseConv2DTests) ... ok
test_sparse_conv2d_with_mask_same_even_block_strides (sparse_conv_tests.SparseConv2DTests) ... ok
test_sparse_conv2d_with_mask_valid (sparse_conv_tests.SparseConv2DTests) ... ok
test_offset_array (sparse_conv_tests.UpsampleIndicesTests) ... ok
test_session (sparse_conv_tests.UpsampleIndicesTests)
Use cached_session instead. ... ok
test_upsample_indices (sparse_conv_tests.UpsampleIndicesTests) ... ok
test_basic (sparse_gather_tests.SparseGatherTests) ... ok
test_large (sparse_gather_tests.SparseGatherTests) ... 2019-05-05 20:24:24.806640: E tensorflow/core/common_runtime/executor.cc:623] Executor failed to create kernel. Invalid argument: Default MaxPoolingOp only supports NHWC on device type CPU
	 [[{{node MaxPool}} = MaxPool[T=DT_FLOAT, data_format="NCHW", ksize=[1, 1, 5, 5], padding="VALID", strides=[1, 1, 4, 4], _device="/job:localhost/replica:0/task:0/device:GPU:0"](MaxPool-0-TransposeNHWCToNCHW-LayoutOptimizer)]]
FAIL
test_session (sparse_gather_tests.SparseGatherTests)
Use cached_session instead. ... ok
test_resblock_gradients (sparse_res_block_tests.ResBlockGradientTests) ... /usr/local/lib/python3.5/dist-packages/tensorflow/python/util/tf_inspect.py:75: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() instead
  return _inspect.getargspec(target)

-------------------------------------------------------
Dense Residual
name                           grad angle    abs err
x                                  21.745      0.653
sub3/conv3/Conv2D:0                 0.020      0.000
sub3/relu3:0                        0.000      0.000
sub3/bn3/FusedBatchNorm:0           0.000      0.000
sub2/conv2/Conv2D:0                 2.916      0.236
sub2/relu2:0                        3.595      0.169
sub2/bn2/FusedBatchNorm:0           3.987      0.169
sub1/conv1/Conv2D:0                 3.129      0.397
sub1/relu1:0                        3.126      0.307
sub1/bn1/FusedBatchNorm:0           3.330      0.307
sub1/bn1/gamma:0                   17.718      0.645
sub1/bn1/beta:0                    12.391      0.355
sub1/conv1/w:0                     15.586      0.825
sub2/bn2/gamma:0                    1.541      0.287
sub2/bn2/beta:0                    41.559      0.501
sub2/conv2/w:0                     14.427      0.620
sub3/bn3/gamma:0                    0.000      0.573
sub3/bn3/beta:0                     0.020      0.000
sub3/conv3/w:0                      0.020      0.000
ok
test_session (sparse_res_block_tests.ResBlockGradientTests)
Use cached_session instead. ... ok
test_session (sparse_res_block_tests.SparseConv2DGradientTests)
Use cached_session instead. ... ok
test_sparse_conv2d_gradient (sparse_res_block_tests.SparseConv2DGradientTests) ... 
-------------------------------------------------------
Sparse Conv Layer
name                           grad angle    abs err
x                                   0.059      0.000
w                                   0.000      0.000
ok
test_session (sparse_res_block_tests.SparseResBlockGradientTests)
Use cached_session instead. ... ok
test_sparse_resblock_gradients (sparse_res_block_tests.SparseResBlockGradientTests) ... 
-------------------------------------------------------
Sparse Residual
name                           grad angle    abs err
x                                   1.587      0.930
SparseScatter:0                     0.040      0.000
SparseGather:0                      1.601      0.930
sub3/bn3/FusedBatchNorm:0           0.000      0.000
sub3/conv3/Conv2D:0                 0.028      0.000
sub3/relu3:0                        0.000      0.000
sub2/conv2/Conv2D:0                 0.052      0.000
sub2/relu2:0                        0.044      0.000
sub2/bn2/FusedBatchNorm:0           0.044      0.000
sub1/conv1/Conv2D:0                 1.386      0.432
sub1/relu1:0                        1.011      0.285
sub1/bn1/FusedBatchNorm:0           1.011      0.285
sub1/bn1/gamma:0                    0.028      0.000
sub1/bn1/beta:0                     0.000      0.000
sub1/conv1/w:0                      0.000      0.000
sub2/bn2/gamma:0                    1.610      0.000
sub2/bn2/beta:0                     0.000      0.000
sub2/conv2/w:0                      0.000      0.000
sub3/bn3/gamma:0                    0.000      0.000
sub3/bn3/beta:0                     0.020      0.000
sub3/conv3/w:0                      0.020      0.000

-------------------------------------------------------
Sparse Residual
name                           grad angle    abs err
x                                   1.587      0.930
SparseScatter:0                     0.040      0.000
SparseGather:0                      1.601      0.930
sub3/bn3/FusedBatchNorm:0           0.000      0.000
sub3/conv3/Conv2D:0                 0.028      0.000
sub3/relu3:0                        0.000      0.000
sub2/conv2/Conv2D:0                 0.052      0.000
sub2/relu2:0                        0.044      0.000
sub2/bn2/FusedBatchNorm:0           0.044      0.000
sub1/conv1/Conv2D:0                 1.386      0.432
sub1/relu1:0                        1.011      0.285
sub1/bn1/FusedBatchNorm:0           1.011      0.285
sub1/bn1/gamma:0                    0.028      0.000
sub1/bn1/beta:0                     0.000      0.000
sub1/conv1/w:0                      0.000      0.000
sub2/bn2/gamma:0                    1.610      0.000
sub2/bn2/beta:0                     0.000      0.000
sub2/conv2/w:0                      0.000      0.000
sub3/bn3/gamma:0                    0.000      0.000
sub3/bn3/beta:0                     0.020      0.000
sub3/conv3/w:0                      0.020      0.000
ok
test_basic (sparse_scatter_tests.SparseScatterTests) ... ok
test_session (sparse_scatter_tests.SparseScatterTests)
Use cached_session instead. ... ok
test_calc_out_size (tf_conv_dims_tests.CalcOutSizeDeconvTests) ... ok
test_session (tf_conv_dims_tests.CalcOutSizeDeconvTests)
Use cached_session instead. ... ok
test_calc_out_size (tf_conv_dims_tests.CalcOutSizeTests) ... ok
test_session (tf_conv_dims_tests.CalcOutSizeTests)
Use cached_session instead. ... ok
test_calc_padding (tf_conv_dims_tests.CalcPaddingTests) ... ok
test_calc_padding_err_ksize_list (tf_conv_dims_tests.CalcPaddingTests) ... ok
test_calc_padding_err_strides_list (tf_conv_dims_tests.CalcPaddingTests) ... ok
test_calc_padding_err_strides_tensor (tf_conv_dims_tests.CalcPaddingTests) ... ERROR::tensorflow:assertion failed: [Expect first and last dimension of `strides` = 1.] [Condition x == y did not hold element-wise:] [x (stack:0) = ] [2 1] [y (Const_1:0) = ] [1 1]
	 [[node assert_equal/Assert/AssertGuard/Assert (defined at /workspace/sbnet/sbnet_tensorflow/benchmark/tf_conv_dims.py:67)  = Assert[T=[DT_STRING, DT_STRING, DT_STRING, DT_INT64, DT_STRING, DT_INT64], summarize=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](assert_equal/Assert/AssertGuard/Assert/Switch/_11, assert_equal/Assert/AssertGuard/Assert/data_0, assert_equal/Assert/AssertGuard/Assert/data_1, assert_equal/Assert/AssertGuard/Assert/data_2, assert_equal/Assert/AssertGuard/Assert/Switch_1/_13, assert_equal/Assert/AssertGuard/Assert/data_4, assert_equal/Assert/AssertGuard/Assert/Switch_2/_15)]]
	 [[{{node assert_equal/Assert/AssertGuard/Assert/_18}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_55_assert_equal/Assert/AssertGuard/Assert", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

Caused by op 'assert_equal/Assert/AssertGuard/Assert', defined at:
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3.5/unittest/__main__.py", line 18, in <module>
    main(module=None)
  File "/usr/lib/python3.5/unittest/main.py", line 94, in __init__
    self.runTests()
  File "/usr/lib/python3.5/unittest/main.py", line 255, in runTests
    self.result = testRunner.run(self.test)
  File "/usr/lib/python3.5/unittest/runner.py", line 176, in run
    test(result)
  File "/usr/lib/python3.5/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python3.5/unittest/suite.py", line 122, in run
    test(result)
  File "/usr/lib/python3.5/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python3.5/unittest/suite.py", line 122, in run
    test(result)
  File "/usr/lib/python3.5/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python3.5/unittest/suite.py", line 122, in run
    test(result)
  File "/usr/lib/python3.5/unittest/case.py", line 648, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python3.5/unittest/case.py", line 600, in run
    testMethod()
  File "/workspace/sbnet/sbnet_tensorflow/benchmark/tf_conv_dims_tests.py", line 109, in test_calc_padding_err_strides_tensor
    p = calc_padding_4d(tf.shape(x), [2, 3, 1, 1], tf.constant(np.array([2, 1, 1, 1])), 'SAME')
  File "/workspace/sbnet/sbnet_tensorflow/benchmark/tf_conv_dims.py", line 106, in calc_padding_4d
    strides = _check_strides(strides)
  File "/workspace/sbnet/sbnet_tensorflow/benchmark/tf_conv_dims.py", line 67, in _check_strides
    message='Expect first and last dimension of `strides` = 1.')
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/check_ops.py", line 390, in assert_equal
    return control_flow_ops.Assert(condition, data, summarize=summarize)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/tf_should_use.py", line 189, in wrapped
    return _add_should_use_warning(fn(*args, **kwargs))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 167, in Assert
    guarded_assert = cond(condition, no_op, true_assert, name="AssertGuard")
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2097, in cond
    orig_res_f, res_f = context_f.BuildCondBranch(false_fn)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 1930, in BuildCondBranch
    original_result = fn()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 165, in true_assert
    condition, data, summarize, name="Assert")
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_logging_ops.py", line 52, in _assert
    name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): assertion failed: [Expect first and last dimension of `strides` = 1.] [Condition x == y did not hold element-wise:] [x (stack:0) = ] [2 1] [y (Const_1:0) = ] [1 1]
	 [[node assert_equal/Assert/AssertGuard/Assert (defined at /workspace/sbnet/sbnet_tensorflow/benchmark/tf_conv_dims.py:67)  = Assert[T=[DT_STRING, DT_STRING, DT_STRING, DT_INT64, DT_STRING, DT_INT64], summarize=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](assert_equal/Assert/AssertGuard/Assert/Switch/_11, assert_equal/Assert/AssertGuard/Assert/data_0, assert_equal/Assert/AssertGuard/Assert/data_1, assert_equal/Assert/AssertGuard/Assert/data_2, assert_equal/Assert/AssertGuard/Assert/Switch_1/_13, assert_equal/Assert/AssertGuard/Assert/data_4, assert_equal/Assert/AssertGuard/Assert/Switch_2/_15)]]
	 [[{{node assert_equal/Assert/AssertGuard/Assert/_18}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_55_assert_equal/Assert/AssertGuard/Assert", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

ok
test_calc_padding_stride (tf_conv_dims_tests.CalcPaddingTests) ... ok
test_calc_padding_valid (tf_conv_dims_tests.CalcPaddingTests) ... ok
test_session (tf_conv_dims_tests.CalcPaddingTests)
Use cached_session instead. ... ok

======================================================================
FAIL: test_large (sparse_gather_tests.SparseGatherTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/workspace/sbnet/sbnet_tensorflow/benchmark/sparse_gather_tests.py", line 122, in test_large
    self._test_sparse_gather(mask, x, w, bsize, ksize, strides, padding)
  File "/workspace/sbnet/sbnet_tensorflow/benchmark/sparse_gather_tests.py", line 79, in _test_sparse_gather
    np.testing.assert_array_equal(set(l1), set(l2))
  File "/usr/local/lib/python3.5/dist-packages/numpy/testing/_private/utils.py", line 896, in assert_array_equal
    verbose=verbose, header='Arrays are not equal')
  File "/usr/local/lib/python3.5/dist-packages/numpy/testing/_private/utils.py", line 819, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not equal

Mismatch: 100%
 x: array({(0.0, 28.0, 29.0), (0.0, 44.0, 45.0), (6.0, 7.0, 0.0), (0.0, 1.0, 2.0), (24.0, 25.0, 26.0), (9.0, 10.0, 11.0), (18.0, 19.0, 20.0), (39.0, 0.0, 36.0), (52.0, 53.0, 54.0), (27.0, 28.0, 32.0), (12.0, 13.0, 14.0), (37.0, 38.0, 39.0), (3.0, 4.0, 8.0), (36.0, 4.0, 5.0), (61.0, 62.0, 63.0), (55.0, 0.0, 60.0), (21.0, 22.0, 23.0), (0.0, 0.0, 0.0), (30.0, 31.0, 0.0), (46.0, 47.0, 0.0), (33.0, 34.0, 35.0), (15.0, 0.0, 20.0), (12.0, 16.0, 17.0), (36.0, 37.0, 38.0)},
      dtype=object)
 y: array({(0.0, 60.0, 61.0), (0.0, 12.0, 13.0), (0.0, 36.0, 37.0), (0.0, 1.0, 2.0), (20.0, 21.0, 22.0), (24.0, 25.0, 26.0), (23.0, 0.0, 28.0), (62.0, 63.0, 0.0), (14.0, 15.0, 0.0), (5.0, 6.0, 7.0), (9.0, 10.0, 11.0), (18.0, 19.0, 20.0), (29.0, 30.0, 31.0), (27.0, 28.0, 32.0), (0.0, 0.0, 4.0), (3.0, 4.0, 8.0), (38.0, 39.0, 0.0), (36.0, 36.0, 37.0), (0.0, 0.0, 0.0), (44.0, 45.0, 46.0), (33.0, 34.0, 35.0), (47.0, 0.0, 52.0), (53.0, 54.0, 55.0), (12.0, 16.0, 17.0)},
      dtype=object)

----------------------------------------------------------------------
Ran 40 tests in 330.915s

FAILED (failures=1)
Makefile:14: recipe for target 'test' failed
make: *** [test] Error 1

@frankfengdi

Has anyone successfully compiled the code with a newer TF (e.g. 1.15) and passed the tests?

@frankfengdi

Has anyone successfully compiled the code with a newer TF (e.g. 1.15) and passed the tests?

Finally it worked. My environment setup:

CUDA 9.0 + tensorflow 1.12 + several modifications in cuda_runtime_api.h

Also, I set -D_GLIBCXX_USE_CXX11_ABI=1, even though I am using g++ 5.4.
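
For anyone comparing their own setup: one way to see which ABI value the installed TensorFlow reports is to look at its compile flags. This is only a sketch, not something taken from the sbnet build scripts. It assumes the installed TensorFlow exposes `tf.sysconfig.get_compile_flags()`, and the helper names `abi_flag_value` and `installed_tf_abi` are made up for illustration.

```python
ABI_PREFIX = "-D_GLIBCXX_USE_CXX11_ABI="


def abi_flag_value(flags):
    """Return the integer after -D_GLIBCXX_USE_CXX11_ABI= in flags, or None if absent."""
    for flag in flags:
        if flag.startswith(ABI_PREFIX):
            return int(flag[len(ABI_PREFIX):])
    return None


def installed_tf_abi():
    """Look up the ABI flag reported by the installed TensorFlow.

    Assumption: tensorflow is importable and provides
    tf.sysconfig.get_compile_flags(); imported lazily so the
    parser above can be used without TensorFlow installed.
    """
    import tensorflow as tf
    return abi_flag_value(tf.sysconfig.get_compile_flags())
```

Calling `installed_tf_abi()` in the same Python environment used for the build gives the value to compare against the one passed to g++. If the flag is not listed at all, the function returns None.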

@maryumja

Can you tell me what modifications you made to cuda_runtime_api.h?
