
Modify EmbeddingDropout and adapt AWD_LSTM to trigger hooks in Embedding layers #2906

Open
wants to merge 3 commits into
base: master

Conversation

arnaujc91

Fix for: #2850

@sutt already fixed the awd_lstm.WeightDropout class so that the ActivationStats callback can run on the AWD_LSTM model. Still, the Embedding layers were not recording any statistics. Furthermore, the Hooks class had duplicated layers because of how flatten_model works.
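To make the duplication concrete, here is a minimal self-contained sketch of the problem (the `Wrapper`, `flatten`, and `Model` names are illustrative, not fastai's actual code): when the same nn.Embedding is registered both directly and inside a wrapper module, a recursive flatten over children collects it twice.

```python
import torch.nn as nn

class Wrapper(nn.Module):
    "Stand-in for the old EmbeddingDropout, which wraps an existing nn.Embedding."
    def __init__(self, emb):
        super().__init__()
        self.emb = emb
    def forward(self, x):
        return self.emb(x)

def flatten(m):
    "Minimal sketch of a flatten_model-style traversal: collect leaf modules recursively."
    children = list(m.children())
    return [m] if not children else sum([flatten(c) for c in children], [])

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Embedding(10, 7)
        self.encoder_dp = Wrapper(self.encoder)  # same Embedding, registered a second time

model = Model()
leaves = flatten(model)
# The single nn.Embedding shows up twice in the flattened list, so hooks
# registered per-leaf get duplicated.
```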

In this PR I propose a modification to solve both issues:

  1. Trigger the hooks for the Embedding layer
  2. Avoid duplicated layers in the Hooks class.

An example of this can be found here: example EmbeddingDropout.
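For context on point 1: a forward hook registered on an nn.Embedding only fires when that module's own forward runs, so a wrapper that looks up the wrapped weight directly never triggers the Embedding's hooks. A minimal sketch (hypothetical names, not the PR's code):

```python
import torch
import torch.nn as nn

emb = nn.Embedding(10, 7)
stats = []
# Record a simple statistic each time the Embedding's forward runs.
emb.register_forward_hook(lambda mod, inp, out: stats.append(out.mean().item()))

emb(torch.randint(0, 10, (8,)))  # forward goes through the module itself: hook fires
```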

To solve the issue I slightly modified the EmbeddingDropout class and adapted AWD_LSTM so that it behaves exactly as before under these changes. I think the code is now more compact than before.
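As a hedged illustration of the idea (not the PR's exact code): an EmbeddingDropout written as a subclass of nn.Embedding owns its own weights, so forward hooks registered on it fire directly and no second module needs to be registered. The constructor signature and `embed_p` name below are assumptions for the sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingDropout(nn.Embedding):
    "Embedding layer that zeroes whole embedding rows with probability `embed_p`."
    def __init__(self, num_embeddings, embedding_dim, embed_p=0., **kwargs):
        super().__init__(num_embeddings, embedding_dim, **kwargs)
        self.embed_p = embed_p

    def forward(self, inp):
        if self.training and self.embed_p != 0:
            # Drop entire rows of the embedding matrix, rescaling the rest.
            mask = self.weight.new_empty((self.weight.size(0), 1)).bernoulli_(1 - self.embed_p)
            weight = self.weight * mask / (1 - self.embed_p)
        else:
            weight = self.weight
        return F.embedding(inp, weight, self.padding_idx, self.max_norm,
                           self.norm_type, self.scale_grad_by_freq, self.sparse)

enc_dp = EmbeddingDropout(10, 7, embed_p=0.5, padding_idx=1)
out = enc_dp(torch.randint(0, 10, (8,)))
```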

I by no means have your expertise; nevertheless, I did my best refactoring the code. If there is anything I did wrong, just let me know.

Thanks a lot!

@arnaujc91 arnaujc91 requested a review from jph00 as a code owner October 26, 2020 11:42

@jph00
Member

jph00 commented Oct 26, 2020

Thanks @arnaujc91 . Can you please fix the test failure shown here in CI? It's saying ModuleAttributeError: 'AWD_LSTM' object has no attribute 'encoder_dp'. Also, could you add to the notebook some basic docs and a test or example for EmbeddingDropout so that I can understand the new code you've added?

@arnaujc91
Author

arnaujc91 commented Oct 26, 2020

Ok, I have created a notebook where I tried to explain as well as possible why I am making this PR and what possible solutions I see to the problem.

See the following notebook: EmbeddingDropout

P.S. I updated again the notebook. Let me know what you think. Thanks a lot!

@jph00 jph00 closed this Oct 31, 2020
@jph00 jph00 reopened this Oct 31, 2020
@jph00 jph00 closed this Nov 3, 2020
@jph00 jph00 reopened this Nov 3, 2020
@jph00
Member

jph00 commented Nov 3, 2020

Thanks @arnaujc91 - I'm still seeing the test failure however. Would you be able to fix that?

Member

@jph00 jph00 left a comment


Please fix test failure:

ModuleAttributeError: 'AWD_LSTM' object has no attribute 'encoder_dp'

@arnaujc91
Author

I modified the PR to pass the previous test, but now it is failing on another test:

```python
      1 enc = nn.Embedding(10, 7, padding_idx=1)
----> 2 enc_dp = EmbeddingDropout(enc, 0.5)
      3 tst_inp = torch.randint(0,10,(8,))
      4 tst_out = enc_dp(tst_inp)
      5 for i in range(8):
```

This is because the EmbeddingDropout class expects an instance of nn.Embedding to be passed to its constructor, but the whole point of my PR is that the new EmbeddingDropout does not need an nn.Embedding instance. So either I change this test, or the tests are too rigid for this PR to make sense.

As I explained in the previous Notebook there are two possibilities to fix the ActivationStats problem:

  1. Modify flatten_model
  2. Modify EmbeddingDropout
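For completeness, alternative 1 could look roughly like this: a flatten that de-duplicates modules by identity, so a shared nn.Embedding is collected only once. This is an illustrative sketch under my own assumptions, not fastai's actual flatten_model.

```python
import torch.nn as nn

def flatten_unique(m, seen=None):
    "Recursively collect leaf modules, skipping any module already seen."
    if seen is None:
        seen = set()
    children = list(m.children())
    if not children:
        if id(m) in seen:
            return []
        seen.add(id(m))
        return [m]
    return sum((flatten_unique(c, seen) for c in children), [])

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Embedding(10, 7)
        # The same Embedding reachable a second time, as with encoder/encoder_dp.
        self.encoder_dp = nn.ModuleList([self.encoder])

net = Net()
leaves = flatten_unique(net)  # the shared Embedding appears only once
```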

I decided to modify EmbeddingDropout. The reason was twofold: it simplified the code, and it solved the issue with the hooks. My current modification only affects EmbeddingDropout; AWD_LSTM stays the same. Do other parts of the library really require EmbeddingDropout to take an instance of nn.Embedding?

If the tests can't be modified, then I will need to close this PR and maybe check whether I can modify flatten_model instead.

Can you, or should I, modify the tests for the EmbeddingDropout layer?

Thanks a lot Jeremy.

@jph00
Member

jph00 commented Nov 3, 2020 via email

@arnaujc91
Author

Before accepting the modifications in this PR, please check the notebook I created, which explains how to make full sense of it. Right now everything works, but the code can be cleaned up further.

Here is the notebook: encoder and encoder_dp are redundant

Thanks a lot!

@hamelsmu hamelsmu added the bug label May 1, 2021

3 participants