
Error when running inference on CPU using C++ bindings: Not found: Container localhost does not exist #41226

Closed
TylerTheHumanCompiler opened this issue Jul 9, 2020 · 13 comments
Assignees
Labels
  • comp:runtime (c++ runtime, performance issues (cpu))
  • stale (this label marks the issue/pr stale - to be closed automatically if no activity)
  • stat:awaiting response (status - awaiting response from author)
  • TF 2.3 (issues related to TF 2.3)
  • type:bug (bug)

Comments

@TylerTheHumanCompiler

TylerTheHumanCompiler commented Jul 9, 2020

System information

  • Have I written custom code: yes
  • OS/Distro: Arch Linux
  • TensorFlow installed from: virtualenv with pip, C/C++ bindings from source without CUDA support
  • TensorFlow version (virtualenv): v1.12.1-34938-g99fea8da0d 2.3.0-rc0
  • TensorFlow version (C/C++ bindings): v1.12.1-35361-ge89160d8d3 2.5.0-dev20200629
  • Python version: 3.7
  • Bazel version: 3.1.0- (@non-git)
  • GCC/Compiler version: 10.1.0
  • CUDA/cuDNN version: 10.2 (Error happens on CPU)
  • GPU model and memory: Nvidia RTX 2080 Ti

Describe the current behavior
Prediction works fine in Python; however, when I try to run inference on the CPU using the C++ bindings, I get the following error:

Failed precondition: Error while reading resource variable dense/kernel from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/dense/kernel)
	 [[{{node dense/Tensordot/ReadVariableOp}}]]

Describe the expected behavior
Being able to predict in C++ without issues.

Standalone code to reproduce the issue
Code is on pastebin.com to increase readability. Links:
Training the model in Python
C++ Header
C++ Code
saved_model_cli show --dir ... --all output

Other info / logs
A similar question was already posted on Stack Overflow, but it has no answers.
The issue also persists in TF 2.2.

@MikhailStartsev
Contributor

This might not be the same cause as yours, but I just had a very similar issue with a self-built TF 2.2. In my case, it was caused by trying to load a saved model from a folder that contained only saved_model.pb, but not the variables subfolder. Once I copied the variables there as well, it started working again.
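A quick way to rule out this failure mode is a shell sanity check on the export directory (a minimal sketch; the function name and path are hypothetical, and only the two entries a SavedModel loader needs are checked):

```shell
# Minimal sketch: verify that a SavedModel export directory contains
# the entries the loader expects (saved_model.pb and variables/).
check_saved_model() {
  dir="$1"
  [ -f "$dir/saved_model.pb" ] || { echo "missing saved_model.pb"; return 1; }
  [ -d "$dir/variables" ]      || { echo "missing variables/ subfolder"; return 1; }
  echo "layout ok"
}

# Usage (hypothetical path):
#   check_saved_model /path/to/exported_model
```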

@Saduf2019 Saduf2019 added the comp:runtime c++ runtime, performance issues (cpu) label Jul 10, 2020
@Saduf2019
Contributor

Saduf2019 commented Jul 10, 2020

@TylerTheHumanCompiler
Please update as per the above comment. There is also a similar issue; please take a look and let us know if it helps:
#28287

Also, I see you are using a very old version of TF. Could you please upgrade to a newer version and confirm whether that resolves your issue?

@Saduf2019 Saduf2019 added the stat:awaiting response Status - Awaiting response from author label Jul 10, 2020
@TylerTheHumanCompiler
Author

TylerTheHumanCompiler commented Jul 11, 2020

@Saduf2019 The variables folder exists and is populated. Can I provide you with additional information?

As for the tf update: I've compiled from source and now have:
v1.12.1-35409-ge085901e2c 2.4.0

and:
libtensorflow_cc.so.2.4.0

@tensorflowbutler tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label Jul 13, 2020
@Saduf2019
Contributor

Saduf2019 commented Jul 13, 2020

@TylerTheHumanCompiler
Is there any particular reason for using TF 1.12 when later versions are available?

Please upgrade your TF version and check whether you still face the issue; this is a very old version that is no longer supported (please use 1.15 or a 2.x version).
Please refer to this link.

@Saduf2019 Saduf2019 added the stat:awaiting response Status - Awaiting response from author label Jul 13, 2020
@TylerTheHumanCompiler
Author

TylerTheHumanCompiler commented Jul 14, 2020

@Saduf2019 No, there isn't, because I'm not using tf 1.12, nor was I when I opened this issue. I don't know why tf.version.GIT_VERSION outputs this, but there are reports of it outputting a wrong version here. Note that the hash above ge085901e2c does not match any commit, whereas git rev-list -n 1 v1.12.1-35409-ge085901e2c (as suggested in the linked issue) returns the hash of this commit, which is 15 days old.
Further, tf.version.VERSION outputs 2.4.0, my libraries in /usr/lib are named libtensorflow_cc.so.2.4.0, and libtensorflow_framework.so.2.4.0, and printing the tf version in C++ using TF_VERSION_STRING (defined in core/public/version.h) also gives me 2.4.0.

I used the following commands for installation:

git clone git://github.com/tensorflow/tensorflow.git
cd tensorflow
git checkout master
./configure
bazel build -c opt --config=numa --config=v2 //tensorflow:libtensorflow_cc.so
bazel build -c opt --config=numa --config=v2 //tensorflow:libtensorflow_framework.so
bazel build --verbose_failures -c opt --config=numa --config=v2 //tensorflow/tools/pip_package:build_pip_package
./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
sudo pip install /tmp/tensorflow_pkg/tensorflow-2.4.0-cp38-cp38-linux_x86_64.whl

@Saduf2019 Saduf2019 added TF 2.3 Issues related to TF 2.3 and removed stat:awaiting response Status - Awaiting response from author labels Jul 15, 2020
@DouglasLivingstone

Note that the hash above ge085901e2c does not match any commit

The 'g' at the start stands for "git"; the hash itself is the part that follows: e085901e2c (matching what you found using git rev-list).
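For reference, git describe output has the form TAG-COMMITS-gHASH, so the individual parts can be recovered with plain shell parameter expansion. A small sketch using the version string from this thread:

```shell
# Split a `git describe` string (TAG-COMMITS-gHASH) into its parts.
desc="v1.12.1-35409-ge085901e2c"

hash="${desc##*-g}"       # everything after the final "-g"
count_tag="${desc%-g*}"   # strip the "-gHASH" suffix
count="${count_tag##*-}"  # commits since the tag
tag="${count_tag%-*}"     # the tag itself

echo "$tag + $count commits, at commit $hash"
# prints: v1.12.1 + 35409 commits, at commit e085901e2c
```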

@TylerTheHumanCompiler
Author

@DouglasLivingstone thanks for shedding light on this.

@gowthamkpr gowthamkpr assigned skye and unassigned gowthamkpr Aug 12, 2020
@gowthamkpr gowthamkpr added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Aug 12, 2020
@skye skye removed their assignment Aug 12, 2020
@skye
Member

skye commented Aug 12, 2020

I don't work on TF anymore, sorry!

@tensorflowbutler tensorflowbutler removed the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Aug 14, 2020
@TylerTheHumanCompiler
Author

Any updates on this?

@sushreebarsa
Contributor

sushreebarsa commented Aug 8, 2021

@TylerTheHumanCompiler Could you please try the latest stable version, TF 2.5, and let us know whether this is still an issue? Thanks!

@sushreebarsa sushreebarsa self-assigned this Aug 8, 2021
@sushreebarsa sushreebarsa added the stat:awaiting response Status - Awaiting response from author label Aug 8, 2021
@google-ml-butler

This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.

@google-ml-butler google-ml-butler bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label Aug 17, 2021
@google-ml-butler

Closing as stale. Please reopen if you'd like to work on this further.



9 participants