Chinese image caption, In the result, multiple words of the same type appear #71

Open
cylvzj opened this issue Feb 20, 2020 · 1 comment

cylvzj commented Feb 20, 2020

Hello, I am using the COCO dataset with a two-layer LSTM model: one layer for top-down attention and one layer for the language model.

I segment the captions into words with jieba.
I used every word that occurs more than 3 times in the image captions as the dictionary file, for a total of 14,226 words:
words = [w for w in word_freq if word_freq[w] > 3]
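
For readers following along, here is a minimal sketch of how such a dictionary could be built from captions that have already been segmented (for example with jieba). The function name `build_vocab`, the `min_count` parameter, and the `<unk>` token are assumptions for illustration and do not come from the original post.

```python
from collections import Counter


def build_vocab(tokenized_captions, min_count=3):
    """Keep words that occur more than `min_count` times (hypothetical helper).

    `tokenized_captions` is a list of token lists, i.e. captions that were
    already segmented into words.
    """
    # Count every token across all captions.
    word_freq = Counter(tok for caption in tokenized_captions for tok in caption)
    # Same threshold as in the post: strictly more than `min_count` occurrences.
    words = sorted(w for w, c in word_freq.items() if c > min_count)
    # Indices start at 1; the "<unk>" entry for rare words is an assumption.
    word_map = {w: i + 1 for i, w in enumerate(words)}
    word_map["<unk>"] = len(word_map) + 1
    return words, word_map
```

Words at or below the threshold would map to `<unk>` at encoding time, so a high threshold on a small corpus can push many distinct words into the same index.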

After training, the captions the model generates contain several words of the same type, for example:

Note notebook laptop computer on bed
A little girl little girl girl standing together

How can I solve this problem?

@HuitMahoon

Did you solve the problem?
I retrained the model with Chinese captions. When testing on images from the training set, I also get the same word repeated multiple times, and I don't know why.
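
As a stopgap for the repetition described in this thread, exact repeats can be collapsed after decoding. This is only a post-processing sketch under the assumption that the caption is available as a list of tokens; the helper name `collapse_repeats` and its `max_n` parameter are hypothetical, and it does not address whatever causes the model to repeat itself.

```python
def collapse_repeats(tokens, max_n=3):
    """Remove immediately repeated n-grams (n = 1..max_n) from a token list."""
    out = list(tokens)
    changed = True
    while changed:
        changed = False
        for n in range(1, max_n + 1):
            i = 0
            # Compare the n-gram at position i with the one right after it.
            while i + 2 * n <= len(out):
                if out[i:i + n] == out[i + n:i + 2 * n]:
                    del out[i + n:i + 2 * n]  # drop the duplicate copy
                    changed = True
                else:
                    i += 1
    return out


caption = ["A", "little", "girl", "little", "girl", "girl", "standing", "together"]
print(" ".join(collapse_repeats(caption)))  # A little girl standing together
```

This only catches exact repeats such as "little girl little girl girl". It leaves near-synonym runs like "notebook laptop computer" untouched, since those are different tokens.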
