LSTM forget gate bias initialization #750
Some papers suggest setting the forget gate bias of LSTMs to a specific value. For example:
http://jmlr.org/proceedings/papers/v37/jozefowicz15.pdf
Is it possible to do this using the current implementation of LSTM/LSTMCell?
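A minimal sketch of one way to do this with the stock `nn.LSTM`, assuming the standard PyTorch gate layout in which each bias vector concatenates the four gates as (input | forget | cell | output), each of length `hidden_size`. Since `bias_ih` and `bias_hh` are summed inside the cell (see the answer below), filling each with 0.5 gives an effective forget bias of 1:

```python
import torch
import torch.nn as nn

hidden_size = 32
lstm = nn.LSTM(input_size=16, hidden_size=hidden_size)

# The forget-gate slice of each bias vector is [hidden_size : 2*hidden_size].
for name, param in lstm.named_parameters():
    if name.startswith("bias"):  # covers both bias_ih_l0 and bias_hh_l0
        with torch.no_grad():
            # bias_ih and bias_hh are added together, so 0.5 in each
            # yields an effective forget gate bias of 1.
            param[hidden_size:2 * hidden_size].fill_(0.5)
```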
Comments

Yes, the ordering of weights and biases is the same for all implementations and is: input gate, forget gate, cell gate, output gate.
What is the difference between "bias_ih" and "bias_hh" in the LSTM and GRU cells? Should both be initialized with values between 1/4 and 1/2?
One of them is added to the linear transform of the input, the other to the linear transform of the hidden state. It's redundant: there could be only one bias and the model would be equivalent. However, that's what cuDNN does, so we preferred to keep it like that for consistency.
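To make the redundancy concrete, here is a small check (not from the thread): shifting an arbitrary amount from `bias_hh_l0` into `bias_ih_l0` leaves the LSTM's output unchanged, because the cell only ever uses their sum.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(5, 3, 16)  # (seq_len, batch, input_size)

a = nn.LSTM(16, 32)
b = nn.LSTM(16, 32)
b.load_state_dict(a.state_dict())  # identical copies

with torch.no_grad():
    # Move an arbitrary shift between the two bias terms;
    # their sum, and hence the gate pre-activations, is unchanged.
    shift = torch.randn_like(a.bias_ih_l0)
    b.bias_ih_l0 += shift
    b.bias_hh_l0 -= shift

out_a, _ = a(x)
out_b, _ = b(x)
print(torch.allclose(out_a, out_b, atol=1e-5))  # True, up to float error
```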
Referenced commit: "forget gate bias initialization. Now set the bias for both the hidden state input and the memory state input. It is still not clear what the best value should be: 0.5, 1, 2, something else? See: http://proceedings.mlr.press/v37/jozefowicz15.pdf, pytorch/pytorch#750. Modified: modules/multi_dimensional_lstm.py, modules/multi_dimensional_lstm_parameters.py, modules/train_multi_dimensional_rnn.py"