tensorflow.python.framework.errors_impl.NotFoundError: Key is_training not found in checkpoint #113

Open
luhawk803 opened this issue Mar 21, 2019 · 0 comments

Hi @brightmart,

When I try to run fastText, I get the error shown below.

I am running a01_FastText with these steps:

  1. python p6_fastTextB_train_multilabel.py
    Everything works fine, and I get a fast_text_checkpoint_multi folder with the following file tree:
    fast_text_checkpoint_multi
    ├── checkpoint
    ├── model.ckpt-5.data-00000-of-00001
    ├── model.ckpt-5.index
    ├── model.ckpt-5.meta
    ├── model.ckpt-6.data-00000-of-00001
    ├── model.ckpt-6.index
    ├── model.ckpt-6.meta
    ├── model.ckpt-7.data-00000-of-00001
    ├── model.ckpt-7.index
    ├── model.ckpt-7.meta
    ├── model.ckpt-8.data-00000-of-00001
    ├── model.ckpt-8.index
    ├── model.ckpt-8.meta
    ├── model.ckpt-9.data-00000-of-00001
    ├── model.ckpt-9.index
    └── model.ckpt-9.meta

  2. I then run
    python p5_fastTextB_predict_multilabel.py
    and get the following errors:

started...
ended...
('cache_path:', 'cache_vocabulary_label_pik/_word_voabulary.pik', 'file_exists:', True)
('vocab_size:', 142040)
('create_voabulary_label_sorted.started.traning_data_path:', 'train-zhihu4-only-title-all.txt')
('length of total question lists:', 0)
('number_examples:', 0)
start padding....
end padding...
2019-03-21 12:12:45.623531: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
Restoring Variables from Checkpoint
$$$        fast_text_checkpoint_multi/model.ckpt-9
2019-03-21 12:12:45.669992: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Not found: Key is_training not found in checkpoint
Traceback (most recent call last):
  File "p5_fastTextB_predict_multilabel.py", line 101, in <module>
    tf.app.run()
  File "/usr/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "p5_fastTextB_predict_multilabel.py", line 66, in main
    saver.restore(sess,tf.train.latest_checkpoint(FLAGS.ckpt_dir))
  File "/usr/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1802, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
    run_metadata)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key is_training not found in checkpoint
	 [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_INT32, DT_INT32, DT_INT32, DT_FLOAT, DT_FLOAT, DT_BOOL], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

Caused by op u'save/RestoreV2', defined at:
  File "p5_fastTextB_predict_multilabel.py", line 101, in <module>
    tf.app.run()
  File "/usr/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "p5_fastTextB_predict_multilabel.py", line 62, in main
    saver=tf.train.Saver()
  File "/usr/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1338, in __init__
    self.build()
  File "/usr/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1347, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1384, in _build
    build_save=build_save, build_restore=build_restore)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 835, in _build_internal
    restore_sequentially, reshape)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 472, in _AddRestoreOps
    restore_sequentially)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 886, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1463, in restore_v2
    shape_and_slices=shape_and_slices, dtypes=dtypes, name=name)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
    op_def=op_def)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

NotFoundError (see above for traceback): Key is_training not found in checkpoint
	 [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_INT32, DT_INT32, DT_INT32, DT_FLOAT, DT_FLOAT, DT_BOOL], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

Could you help me with this? Thanks in advance.
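
One possible way to narrow this down (a sketch, not a confirmed fix): the error suggests the prediction graph contains a variable named is_training that was never written to the checkpoint, so it may help to compare the variable names in the graph against those stored in model.ckpt-9 and restore only the overlap. The helper names split_by_checkpoint and restore_available below are hypothetical and not part of the repository. The TensorFlow calls are the 1.x API that appears in the traceback. Initializing the skipped variables to their defaults is an assumption about what the predict script needs.

```python
def split_by_checkpoint(graph_names, checkpoint_names):
    """Split graph variable names into (restorable, missing) for a checkpoint."""
    in_ckpt = set(checkpoint_names)
    restorable = [n for n in graph_names if n in in_ckpt]
    missing = [n for n in graph_names if n not in in_ckpt]
    return restorable, missing


def restore_available(sess, ckpt_dir):
    """Restore only variables that exist in the latest checkpoint (TF 1.x).

    Hypothetical helper: variables absent from the checkpoint (for example
    is_training) are initialized to their default values instead.
    """
    import tensorflow as tf  # imported lazily so the helper above runs without TF

    ckpt_path = tf.train.latest_checkpoint(ckpt_dir)
    # list_variables yields (name, shape) pairs stored in the checkpoint
    ckpt_names = [name for name, _shape in tf.train.list_variables(ckpt_path)]
    graph_vars = {v.op.name: v for v in tf.global_variables()}

    restorable, missing = split_by_checkpoint(sorted(graph_vars), ckpt_names)
    print("variables not in checkpoint: %s" % missing)

    # Restore the overlap, then give the remaining variables their initial values
    saver = tf.train.Saver(var_list=[graph_vars[n] for n in restorable])
    saver.restore(sess, ckpt_path)
    sess.run(tf.variables_initializer([graph_vars[n] for n in missing]))
    return missing
```

In p5_fastTextB_predict_multilabel.py this would stand in for the saver.restore(sess, tf.train.latest_checkpoint(FLAGS.ckpt_dir)) call, for example restore_available(sess, FLAGS.ckpt_dir). If the printed list contains anything besides is_training, the training and prediction graphs probably differ in more than that one flag.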
