LSTM forget gate bias initialization #750

Closed
ikostrikov opened this issue Feb 15, 2017 · 3 comments

Comments

@ikostrikov

Some papers suggest setting the forget gate bias of LSTMs to a specific value. For example:
http://jmlr.org/proceedings/papers/v37/jozefowicz15.pdf

Is it possible to do this using the current implementation of LSTM/LSTMCell?

@apaszke
Contributor

apaszke commented Feb 17, 2017

Yes, the ordering of weights and biases is the same for all implementations: ingate, forgetgate, cellgate, outgate. To set the forget gate bias, initialize the second quarter of the bias vector (the elements from 1/4 to 1/2 of its length, i.e. indices `hidden_size` to `2 * hidden_size`) to the desired value.
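A minimal sketch of that slice assignment, assuming a recent PyTorch where `nn.LSTM` names its bias parameters `bias_ih_l{k}` and `bias_hh_l{k}`. The helper `set_forget_gate_bias` is a hypothetical name, not part of the library. Because each layer carries two bias vectors that are added together, this sketch puts the desired value in `bias_ih` and zeroes the matching slice of `bias_hh`, so the effective forget bias equals `value`. That split is a design choice of the sketch, not something the library requires.

```python
import torch
import torch.nn as nn


def set_forget_gate_bias(lstm: nn.Module, value: float = 1.0) -> None:
    """Set the effective forget-gate bias of an nn.LSTM / nn.LSTMCell to `value`.

    Hypothetical helper: relies on the gate order
    ingate, forgetgate, cellgate, outgate in each bias vector.
    """
    h = lstm.hidden_size
    with torch.no_grad():
        for name, param in lstm.named_parameters():
            if name.startswith("bias_ih"):
                # Second quarter of the vector is the forget gate.
                param[h:2 * h].fill_(value)
            elif name.startswith("bias_hh"):
                # Zero the twin slice so the two biases sum to `value`.
                param[h:2 * h].zero_()


lstm = nn.LSTM(input_size=8, hidden_size=16, num_layers=2)
set_forget_gate_bias(lstm, value=1.0)
```

The same helper should also work on an `nn.LSTMCell`, whose parameters are named `bias_ih` and `bias_hh`.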

@apaszke apaszke closed this as completed Feb 17, 2017
@kellywzhang

What is the difference between "bias_ih" and "bias_hh" in the LSTM and GRU cells? Should both be initialized with ones between 1/4 and 1/2?

@apaszke
Contributor

apaszke commented Feb 23, 2017

One of them is added to the linear transform of the input, the other to the hidden transform. This is redundant: a single bias would give an equivalent model. However, that's what cuDNN does, so we preferred to keep it that way for consistency.
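The redundancy can be checked numerically. The sketch below, assuming a recent PyTorch, builds a second `nn.LSTMCell` whose `bias_ih` holds the sum of the original two biases and whose `bias_hh` is zero, then compares the outputs. The variable names (`cell`, `merged`, `same`) are illustrative only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

cell = nn.LSTMCell(input_size=4, hidden_size=3)

# Copy the weights, then fold both biases into bias_ih and zero bias_hh.
merged = nn.LSTMCell(input_size=4, hidden_size=3)
merged.load_state_dict(cell.state_dict())
with torch.no_grad():
    merged.bias_ih.copy_(cell.bias_ih + cell.bias_hh)
    merged.bias_hh.zero_()

x = torch.randn(2, 4)    # batch of 2 inputs
h0 = torch.randn(2, 3)   # initial hidden state
c0 = torch.randn(2, 3)   # initial cell state

h_a, c_a = cell(x, (h0, c0))
h_b, c_b = merged(x, (h0, c0))

# Only the sum of the two biases matters, so both cells should agree.
same = torch.allclose(h_a, h_b, atol=1e-6) and torch.allclose(c_a, c_b, atol=1e-6)
```

Since only the sum enters the gate pre-activations, filling the forget slice of both `bias_ih` and `bias_hh` with ones gives an effective forget bias of 2, not 1.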

gwenniger added a commit to gwenniger/multi-hare that referenced this issue Jun 25, 2019
forget gate bias initialization. Now set the bias for both the
hidden states input and the memory state input. It is still not clear what
should be the best value: 0.5, 1, 2, something else?
See: http://proceedings.mlr.press/v37/jozefowicz15.pdf
pytorch/pytorch#750

	modified:   modules/multi_dimensional_lstm.py
	modified:   modules/multi_dimensional_lstm_parameters.py
	modified:   modules/train_multi_dimensional_rnn.py
KyleCZH pushed a commit to KyleCZH/pytorch that referenced this issue Sep 20, 2021
Otherwise, mkl-2021.2 gets installed which contains both .so and .so.1 binaries