
LSTM weights not optimised #11

Open
timmeinhardt opened this issue Jun 21, 2019 · 3 comments

timmeinhardt commented Jun 21, 2019

The LayerNormLSTMCell modules initialised in the MetaOptimizer class are not registered as submodules of the MetaOptimizer model, so their parameters never appear in the model's parameter list. Appending them to the plain Python list self.lstms:

self.lstms.append(LayerNormLSTMCell(hidden_size, hidden_size))

will not add their trainable parameters to the model parameter list in:

optimizer = optim.Adam(meta_optimizer.parameters(), lr=1e-3)
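
For illustration, here is a minimal, self-contained sketch of the problem (it uses the built-in nn.LSTMCell as a stand-in for LayerNormLSTMCell; the class names Broken and Fixed are made up for this example):

import torch.nn as nn

# Submodules stored in a plain Python list are invisible to nn.Module.
class Broken(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstms = [nn.LSTMCell(4, 4)]  # plain list: the cell is NOT registered

# nn.ModuleList (or nn.Sequential) registers each cell with the parent module.
class Fixed(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstms = nn.ModuleList([nn.LSTMCell(4, 4)])

print(len(list(Broken().parameters())))  # 0 -> the optimiser would see nothing
print(len(list(Fixed().parameters())))   # 4 -> weight_ih, weight_hh, bias_ih, bias_hh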

If I am not mistaken, the current version does not train the LSTM weights at all. In general, I would suggest restructuring the initialisation and the MetaOptimizer.forward method, but as a quick fix one could replace the entire self.lstms initialisation block with this:

self.lstms = nn.Sequential(*[LayerNormLSTMCell(hidden_size, hidden_size)
                             for _ in range(num_layers)])
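
Since the cells are presumably stepped one at a time in MetaOptimizer.forward, each taking an input and an (h, c) state that nn.Sequential could not chain automatically, nn.ModuleList may be the more idiomatic container here; it registers the cells just the same without implying a chained call order:

self.lstms = nn.ModuleList([LayerNormLSTMCell(hidden_size, hidden_size)
                            for _ in range(num_layers)])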
@YanwenZhu

This quick fix worked, thanks! By the way, have you been able to reproduce the experiments from the paper using MetaOptimizer? For me the final loss of each epoch is quite large, around 4 at best, and I cannot figure out why. Could you give me some pointers?

@timmeinhardt (Author)

@YanwenZhu Sorry, but I did not reimplement the original experiments. I applied the Learning to Learn approach to an unrelated problem.

@YanwenZhu

@timmeinhardt Thanks anyway!
