
Shakespeare - no decent results. #88

Open
eileen-bluerose opened this issue May 3, 2019 · 1 comment

@eileen-bluerose
I ran the Shakespeare example overnight and got awkward results from both the Hackage release package:
shakespeare_output_from_clean_cabal_installation.txt
and from the freshly downloaded source code:
shakespeare_output_from_github_source.txt

I used the training data proposed in the source code:
https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt

How should I run the example so that it produces more realistic output (like the generated sequence shown in the example)?

@HuwCampbell
Owner

Hi Celina,

This one unfortunately takes a long time.

There are a few reasons for this, the biggest being the lack of minibatching. For convolutional networks (the first proving ground for Grenade), almost all non-trivial computation is matrix-matrix multiplication, so minibatching doesn't give a huge computational benefit, and I left it out for semantic clarity. For LSTMs, however, the work is mostly matrix-vector multiplication, which becomes matrix-matrix multiplication with minibatching. That makes a big difference: with a decent BLAS library, multiplying by a 50-column matrix costs only about 5 times as much as a single matrix-vector op.
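To make that concrete, here is a minimal sketch using hmatrix (which Grenade builds on); the function names are illustrative, not part of Grenade's API. Stacking a batch of input vectors as the columns of a matrix turns many matrix-vector products (GEMV) into one matrix-matrix product (GEMM), which BLAS executes far more efficiently:

```haskell
import Numeric.LinearAlgebra
import Prelude hiding ((<>))  -- hmatrix uses (<>) for the matrix product

-- Unbatched: one matrix-vector product (GEMV) per input.
unbatched :: Matrix Double -> [Vector Double] -> [Vector Double]
unbatched w = map (w #>)

-- Minibatched: stack the inputs as columns and do a single
-- matrix-matrix product (GEMM). With a decent BLAS, multiplying
-- by a 50-column matrix is nowhere near 50 times the cost of a
-- single matrix-vector product.
batched :: Matrix Double -> [Vector Double] -> [Vector Double]
batched w xs = toColumns (w <> fromColumns xs)
```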

The other two things I do differently are not propagating the previous batch's input vector (which would line the batches up for truncated back-propagation through time), and using plain SGD with momentum rather than Adam or another optimiser.
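For reference, the SGD-with-momentum update itself is tiny. A minimal sketch over hmatrix vectors (the name `sgdMomentum` and the hyperparameters are illustrative, not Grenade's actual implementation):

```haskell
import Numeric.LinearAlgebra

-- One SGD-with-momentum step:
--   v' = mu * v - lr * g   (velocity: a decaying sum of past gradients)
--   w' = w + v'            (move the weights along the velocity)
sgdMomentum
  :: Double                          -- learning rate lr
  -> Double                          -- momentum mu, e.g. 0.9
  -> (Vector Double, Vector Double)  -- (weights w, velocity v)
  -> Vector Double                   -- gradient g at w
  -> (Vector Double, Vector Double)  -- (w', v')
sgdMomentum lr mu (w, v) g =
  let v' = scale mu v - scale lr g
      w' = w + v'
  in  (w', v')
```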

If memory serves, I trained for over 24 hours, slowly ramping up the training sequence length: starting at about 15 characters until spaces were interspersed well, then working up to 25 and then 50 as words and basic grammar appeared.
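As a sketch of that kind of curriculum (the helper names and schedule below are hypothetical, not the example's actual code), one can slice the corpus into progressively longer training sequences:

```haskell
-- Cut the corpus into non-overlapping sequences of the current
-- curriculum length, dropping any short remainder at the end.
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = take n xs : chunksOf n (drop n xs)

trainingSequences :: Int -> String -> [String]
trainingSequences len corpus =
  filter ((== len) . length) (chunksOf len corpus)

-- Ramp the sequence length as described above: 15 characters until
-- spaces appear, then 25 and 50 as words and grammar emerge.
schedule :: [Int]
schedule = [15, 25, 50]
```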

Huw
