register_forward_hook in AWD_LSTM not working #2850
But the implementation of fastai does not do it that way. Please try to run the following code:
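Something along these lines (a minimal sketch; the constructor sizes and the random batch are arbitrary choices for the demo):

```python
import torch
from fastai.text.all import AWD_LSTM

# Build a small AWD_LSTM and put a forward hook on every submodule.
model = AWD_LSTM(vocab_sz=100, emb_sz=8, n_hid=16, n_layers=2)

fired = []
def hook(module, inp, out):
    fired.append(type(module).__name__)

handles = [m.register_forward_hook(hook) for m in model.modules()]
model(torch.randint(0, 100, (1, 10)))
for h in handles:
    h.remove()

print(fired)  # which layers actually triggered their hook
```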
The result, for me, is that the forward hooks do not work for almost all the layers under the fastai implementation of ActivationStats. That was the main motivation for this post: from the beginning, this piece of code was not working as expected. Please tell me if I am doing something wrong (or is this a bug?). Thanks!
These are excellent examples. I am reproducing your results exactly. In fastai/callback/hook.py, this caught my eye:
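Paraphrasing the selection logic from memory, the effect is roughly this (`has_params` is re-implemented here just for the demo; the exact fastai line may differ):

```python
from fastai.text.all import AWD_LSTM, flatten_model

def has_params(m):
    # same idea as fastai's helper: keep only modules that own parameters
    return len(list(m.parameters())) > 0

model = AWD_LSTM(vocab_sz=100, emb_sz=8, n_hid=16, n_layers=2)
# HookCallback-style selection: flattened leaf modules that have parameters
hooked = [m for m in flatten_model(model) if has_params(m)]
print([type(m).__name__ for m in hooked])
```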
Well, the expected behaviour is that you get the statistics of the activations for each of the layers. I went deep into the language model and realized that this happens because of two dropout layers in AWD_LSTM. If you remove these dropouts, the forward hooks work fine for the AWD_LSTM architecture. Please let me know how we can fix this. Thanks a lot for your time, it is really appreciated!
I am not able to reproduce your steps above; my initial testing is showing a different result.
OK, let's try the following. First, I modify the AWD_LSTM class in the following way:
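Roughly like this, sketched as an in-place swap rather than the exact diff (the attribute names `encoder_dp` and `input_dp` come from the fastai source):

```python
import torch.nn as nn
from fastai.text.all import AWD_LSTM

model = AWD_LSTM(vocab_sz=100, emb_sz=8, n_hid=16, n_layers=2)

# The two changed "lines": bypass both dropout wrappers, so every call in
# forward() goes through a plain module instead.
model.encoder_dp = model.encoder  # plain embedding lookup, no EmbeddingDropout
model.input_dp = nn.Identity()    # pass-through instead of RNNDropout
```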
As you can see, I have modified just two lines of the original code. Now let's run the following experiment:
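For instance, something like the following comparison (the helper and the sizes are just for the demo):

```python
import torch
import torch.nn as nn
from fastai.text.all import AWD_LSTM

def fired_hooks(model):
    "Run one batch through `model` and return the layers whose hooks fired."
    fired = []
    def hook(m, i, o): fired.append(type(m).__name__)
    handles = [m.register_forward_hook(hook) for m in model.modules()]
    model(torch.randint(0, 100, (1, 10)))
    for h in handles: h.remove()
    return fired

stock = AWD_LSTM(vocab_sz=100, emb_sz=8, n_hid=16, n_layers=2)
patched = AWD_LSTM(vocab_sz=100, emb_sz=8, n_hid=16, n_layers=2)
patched.encoder_dp, patched.input_dp = patched.encoder, nn.Identity()

print(fired_hooks(stock))    # compare which layer types show up in each list
print(fired_hooks(patched))
```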
You will see how the modified model picks up the hooks (for me it does). Check if this also happens for you. Then we would have to see why these two dropouts do not retain the hooks; could it come from PyTorch, maybe? At this point I do not know, I just stopped there. It would be really cool if we could figure it out! :)
Thanks for this! Let me take a look at it.
I've found the problem. There are two lines of code you need to edit. First, the main problem comes from the way the modules to hook are selected. Second, we need to handle the tuple types that the LSTM layers return as output, so the hook function has to be altered accordingly:
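The unwrap looks something like this (a sketch of the idea, not the exact patch):

```python
def stats_hook(module, inp, out):
    # LSTM layers return (output, (h, c)); keep only the output tensor
    if isinstance(out, tuple):
        out = out[0]
    out = out.float()
    print(type(module).__name__, out.mean().item(), out.std().item())
```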
This gives you largely what you're looking for.
I think basically all the layers are being captured now. The last issue is that I do not really know what the ParameterModule layers are. Summary: for sure the first two layers should collect statistics of the weights of the embeddings, and my bet would be that no element of the list should stay empty. Can we maybe ask the people who implemented this functionality?
Hey @sutt! I saw your pull request. I wanted to ask you: what about the Embedding layer? I have spotted the problem: basically, EmbeddingDropout is not captured when the model is flattened into the list of modules to hook. The hooks themselves are actually working, but the method that selects the modules skips the wrapper. Therefore there are just two options: either modify the layer itself or modify the module selection. Any ideas? I have been thinking about it, but I do not consider myself experienced enough to put together a decent PR. Thanks a lot @sutt!
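A quick check makes it visible (assuming `flatten_model` recurses into children and keeps only the leaves, as in fastai's torch_core):

```python
from fastai.text.all import AWD_LSTM, flatten_model

model = AWD_LSTM(vocab_sz=100, emb_sz=8, n_hid=16, n_layers=2)
# EmbeddingDropout has a child module (the inner Embedding), so flattening
# descends into it and the wrapper itself never shows up in the list.
print([type(m).__name__ for m in flatten_model(model)])
```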
@arnaujc91 Thanks for these great examples.
Found a solution! I just wrote a new version of the embedding dropout layer:
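It looks roughly like this (a sketch reconstructed around fastai's EmbeddingDropout logic; the class name is just illustrative):

```python
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingWithDropout(nn.Embedding):
    "An Embedding that applies embedding dropout to its own weight matrix."
    def __init__(self, ni, nf, embed_p=0.1, **kwargs):
        super().__init__(ni, nf, **kwargs)
        self.embed_p = embed_p

    def forward(self, words):
        if self.training and self.embed_p != 0:
            # zero out whole embedding rows and rescale the survivors,
            # the same trick EmbeddingDropout uses
            mask = self.weight.new_empty((self.weight.size(0), 1))
            mask = mask.bernoulli_(1 - self.embed_p) / (1 - self.embed_p)
            weight = self.weight * mask
        else:
            weight = self.weight
        return F.embedding(words, weight, self.padding_idx, self.max_norm,
                           self.norm_type, self.scale_grad_by_freq, self.sparse)
```

With the dropout living on the embedding itself there is only one leaf module with parameters, so the flattening and the hook machinery should pick it up.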
As you can see, I just slightly modified the original class. Besides, I think it is also a cleaner solution: instead of using two layers (the embedding plus a dropout wrapper around it), a single layer does both jobs. On the other hand, I do not know if this breaks something elsewhere. Let me know what you think! :)
@arnaujc91
I've searched the repository for examples of how this layer is used elsewhere. Nice work!
Hey @sutt! encoder_dp is just used when embedding dropout is applied. Yes, I saw that the test failed, so something must still rely on the old structure. Thanks a lot Will, I have learned a lot doing this! Thank you for your time and patience!
In progress in #2906 thanks to @arnaujc91 |
Why are the forward hooks not working for the AWD_LSTM model? I try the following:
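Something like this minimal snippet (a reconstruction; the sizes and the hooked layer are arbitrary):

```python
import torch
from fastai.text.all import AWD_LSTM

model = AWD_LSTM(vocab_sz=100, emb_sz=8, n_hid=16, n_layers=2)

def hook(module, inp, out):
    print("forward hook fired on", type(module).__name__)

# hook one of the inner layers and push a batch of token ids through
model.rnns[0].register_forward_hook(hook)
out = model(torch.randint(0, 100, (1, 10)))
```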
and the output is just the result of forwarding the input, with no hooks activated. Clearly the forward hook is not working. Why?