This repository has been archived by the owner on Aug 18, 2021. It is now read-only.

Question from character level RNN classifier, why not use the hidden state across epochs? #139

Open
labJunky opened this issue Nov 27, 2019 · 1 comment

Comments

@labJunky

labJunky commented Nov 27, 2019

In the RNN classification example, which uses the characters of a name to predict the name's language, the train function re-zeros the hidden state (and the gradients) every epoch. I was wondering why this is done, instead of carrying over the final hidden state from the epoch before?

@ZhouXing19

In this example, one epoch is a run-through of a single word. Starting a new epoch means training the network on a new word, so the hidden state must be re-initialized before the new word's first letter: the hidden states of different words are independent, and carrying the old state over would let one name's characters influence the prediction for an unrelated name.
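To make this concrete, here is a minimal pure-Python sketch (toy weights and sizes, not the tutorial's actual PyTorch code) of a character-by-character recurrence where each word gets a fresh zero hidden state, so the result for one word cannot depend on the word processed before it:

```python
import math

HIDDEN_SIZE = 4  # hypothetical size, chosen only for illustration

def init_hidden():
    # Fresh zero hidden state for each new word.
    return [0.0] * HIDDEN_SIZE

def rnn_step(char_vec, hidden, w=0.5, u=0.3):
    # Toy Elman-style update, elementwise: h_t = tanh(w*x_t + u*h_{t-1}).
    # The real tutorial uses learned weight matrices instead of scalars.
    return [math.tanh(w * x + u * h) for x, h in zip(char_vec, hidden)]

def run_word(char_vectors):
    # One "epoch" in the question's sense: a full pass over one word.
    hidden = init_hidden()  # states of different words are independent
    for vec in char_vectors:
        hidden = rnn_step(vec, hidden)
    return hidden  # the final hidden state would feed the classifier

# Each call starts from zeros, so processing word_a first has no
# effect on the result for word_b.
word_a = [[1.0] * HIDDEN_SIZE, [0.5] * HIDDEN_SIZE]
word_b = [[0.2] * HIDDEN_SIZE]
result_b = run_word(word_b)
```

If the final hidden state were carried across words instead, `run_word(word_b)` would return different values depending on which word happened to be trained on just before it, which is exactly the cross-word dependence the re-zeroing avoids.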
