Cannot train when a new vocab file is generated #78

Open
rkolaf opened this issue Oct 21, 2019 · 1 comment

rkolaf commented Oct 21, 2019

I combined some of your data with my own data and generated a new vocab file. When I train the model with bottrainer.py, it starts running the first epoch and then stops with this error:
Traceback (most recent call last):
  File "bottrainer.py", line 161, in <module>
    bt.train(res_dir)
  File "bottrainer.py", line 88, in train
    step_result = self.model.train_step(sess, learning_rate=learning_rate)
  File "/home/ruksin/PycharmProjects/chatlearner/chatbot/modelcreator.py", line 122, in train_step
    feed_dict={self.learning_rate: learning_rate})
  File "/home/ruksin/PycharmProjects/chatlearner/venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 889, in run
    run_metadata_ptr)
  File "/home/ruksin/PycharmProjects/chatlearner/venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1120, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/ruksin/PycharmProjects/chatlearner/venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1317, in _do_run
    options, run_metadata)
  File "/home/ruksin/PycharmProjects/chatlearner/venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1336, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: pos 0 out of range for stringb'' at index 0
[[Node: Substr = Substr[T=DT_INT32](arg0, Substr/pos, Substr/len)]]
[[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,?], [?,?], [?,?], [?], [?]], output_types=[DT_INT32, DT_INT32, DT_INT32, DT_INT32, DT_INT32], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Can anyone please help me with this? Am I doing anything wrong?
I reduced the batch size to 2 and the number of units to 800. I was able to train with your data before, but when I tried my own data along with some of yours, I get this error.

The vocab generator Python script prints this result:
Vocab size after all base data files scanned: 256
Vocab size after cornell data file scanned: 18354
The final vocab file generated. Vocab size: 18380
Can anyone please help me with this?

@bshao001
Owner

It is very likely that your data has some kind of problem, but I cannot tell without looking at the details. You can search around; I believe I have seen similar errors from someone else before.
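
The error message does give one hint: the Substr op was handed an empty string (b'') and asked to read from position 0, so an empty line may have reached the input pipeline. One way to check is to scan the data files for blank or whitespace-only lines. The following is only a minimal sketch, assuming the training data is plain UTF-8 text with one entry per line; the function names and the "*.txt" pattern are hypothetical and not part of this project, and blank lines may not be the only cause.

```python
from pathlib import Path


def find_blank_lines(path):
    """Return 1-based line numbers of lines that are empty or whitespace-only."""
    blank = []
    with open(str(path), encoding="utf-8", errors="replace") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                blank.append(lineno)
    return blank


def scan_directory(data_dir, pattern="*.txt"):
    """Map each matching file under data_dir to its blank line numbers.

    Files without blank lines are left out of the result.
    The "*.txt" pattern is an assumption; adjust it to the actual data files.
    """
    report = {}
    for file_path in sorted(Path(data_dir).rglob(pattern)):
        blank = find_blank_lines(file_path)
        if blank:
            report[str(file_path)] = blank
    return report
```

Calling scan_directory on the folder that holds the newly added data returns a dictionary of file paths and line numbers. If it reports anything in the files you added, removing those lines (or comparing their layout with the data files that trained successfully) would be a reasonable first thing to try.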
