Implement e-prop #226
Comments
I think this can be implemented rather easily by adding a […]. Let me know what you think; I can then work on a draft.
I think the general idea could work, but with the current norse architecture we could probably follow the SuperSpike approach (https://github.com/norse/norse/blob/master/norse/torch/functional/threshold.py), adding e-prop as a new method for the threshold function. Then, in the forward pass, we just save the input tensor for the backward pass.

Does that sound feasible to you, @Huizerd? With that approach we would also avoid having to add something to all SNNCell types, and instead let the autograd function take care of this.
@skiwan I don't follow you, could you elaborate/give an example? :) How could you specify when to block/not block gradients with the same call to the threshold?
I'm also quite new to this (e-prop as well as how PyTorch and autograd work), so I might not be 100% correct in my thinking, but right now, for example with the LIF layer […]. Then, similar to the SuperSpike method (https://github.com/norse/norse/blob/master/norse/torch/functional/superspike.py), we can do all the calculations needed for e-prop there.

Does that make more sense? Or could you explain in a bit more detail from your side how you would approach this?
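For illustration, a minimal sketch of what such a threshold function could look like, following the SuperSpike pattern of saving the input tensor in `forward` and applying a surrogate/pseudo-derivative in `backward`. The names `EPropSpike`, `eprop_fn`, and `gamma` are hypothetical, not existing norse API, and the pseudo-derivative shape is the one from Bellec et al.:

```python
import torch


class EPropSpike(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input_tensor, gamma):
        # Save the thresholded input (assumed to be v - v_th) for the backward
        # pass, exactly as SuperSpike does.
        ctx.save_for_backward(input_tensor)
        ctx.gamma = gamma
        return torch.gt(input_tensor, 0.0).to(input_tensor.dtype)

    @staticmethod
    def backward(ctx, grad_output):
        (input_tensor,) = ctx.saved_tensors
        # e-prop pseudo-derivative: gamma * max(0, 1 - |v - v_th|),
        # assuming the input is already normalized by the threshold.
        pseudo = ctx.gamma * torch.relu(1.0 - torch.abs(input_tensor))
        return grad_output * pseudo, None


def eprop_fn(x: torch.Tensor, gamma: float = 0.3) -> torch.Tensor:
    return EPropSpike.apply(x, gamma)
```

Note that this only covers the surrogate gradient; the actual e-prop gradient blocking (e.g. through the reset or recurrent spikes) would still have to happen in the cell step itself.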
If we're keen on stopping gradients, I agree this could/should be handled on a module level. And we're actually right now forcing the voltage tensor to retain the gradient graph on a module level.

What I'm not 100% clear on is how it composes with other layers. If you […] If that's the case, could it be solved by storing the previous tensor, cloning it in an E-Prop module, and then reusing the previous tensor in a future backward step, like so? Is that close to what you meant, @Huizerd?
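The snippet referenced by "like so?" is not preserved here; a rough sketch of the idea (class and attribute names are made up, and `cell` stands for any norse SNNCell-style module returning `(spikes, state)`) might look like:

```python
import torch


class EPropWrapper(torch.nn.Module):
    def __init__(self, cell: torch.nn.Module):
        super().__init__()
        self.cell = cell
        self.previous_spikes = None  # cached, detached copy of the last output

    def forward(self, x, state=None):
        z, state = self.cell(x, state)
        # Keep a detached clone around so a later step can reuse the values
        # without extending the autograd graph across timesteps.
        self.previous_spikes = z.detach().clone()
        return z, state
```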
e-prop only blocks gradients between timesteps (for the reset, for instance), not between layers. Think of it as 1-step truncated BPTT. So @Jegp, I think this shouldn't be a problem for other layers in the network.

@Jegp I was always wondering about that (forcing the voltage tensor to retain the gradient graph).

@skiwan The SuperSpike in norse is just the shape of their surrogate gradient, not the actual learning method they implement in that paper. You're right that the learning methods of SuperSpike and e-prop are similar. I didn't look into their implementation, though; I only know that e-prop can be done by blocking gradients in certain places.
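One way to read "blocks gradients between timesteps" in a plain training loop: detach only the spikes carried across the timestep boundary (the recurrent/reset path), while leaving the graph within each step intact. The `recurrent_cell(x, z_prev, state)` signature below is hypothetical, not norse's API:

```python
import torch


def run_eprop_style(recurrent_cell, xs, z0, state0):
    # xs: iterable of per-timestep input tensors; z0/state0: initial spikes/state.
    z, state = z0, state0
    outputs = []
    for x in xs:
        # Block the gradient through the previous spikes; gradients within the
        # current step (e.g. through the membrane potential) still flow.
        z, state = recurrent_cell(x, z.detach(), state)
        outputs.append(z)
    return torch.stack(outputs)
```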
I see, that makes a lot of sense, actually. And if it's only related to the previous spikes, wouldn't it be possible to even create a wrapper module that "just" detaches the output spikes and caches them in the state?

The line exists purely for convenience. The state tensors are, theoretically, leaf tensors, in that they are not strictly associated with the module and their gradients won't accumulate. This is basically one way of forcing Torch to include the state tensors in the gradient computations (https://pytorch.org/docs/stable/notes/autograd.html). I'm not ecstatic about the solution, so I'm curious to know whether there are better ways of achieving the same.
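For reference, the generic PyTorch pattern being described here (not the exact norse line): a freshly created state tensor is a leaf, and setting `requires_grad` forces autograd to track operations on it even though it is not a module parameter.

```python
import torch

# A state tensor created as a leaf with requires_grad=True: not a parameter,
# but autograd will now record the operations applied to it.
v = torch.zeros(8, 100, requires_grad=True)
print(v.is_leaf, v.requires_grad)  # True True
```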
I guess there could be an extended state that also includes detached spikes, because the problem is that you need both attached and detached spikes (following the original e-prop implementation). Not sure if this would be nice, and I think you would still need some check in the neuron function on which spikes to use where.

Another solution could be to have an entirely different function for e-prop, say […]
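A hedged sketch of the "extended state" idea: a state tuple carrying both the attached spikes and a detached copy, so the neuron function can pick whichever the update rule needs. Field names are hypothetical and only loosely mirror a LIF state with spikes, voltage, and current; this is not an existing norse type:

```python
from typing import NamedTuple

import torch


class LIFEPropState(NamedTuple):
    z: torch.Tensor           # previous spikes, attached to the graph
    z_detached: torch.Tensor  # detached copy, used where e-prop blocks gradients
    v: torch.Tensor           # membrane potential
    i: torch.Tensor           # synaptic input current


def extend_state(z: torch.Tensor, v: torch.Tensor, i: torch.Tensor) -> LIFEPropState:
    # Keep both versions of the spikes so the neuron function can choose per
    # use-site (e.g. detached for the reset, attached for the layer output).
    return LIFEPropState(z=z, z_detached=z.detach(), v=v, i=i)
```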
I noticed that, for e.g. […]
This https://github.com/ChFrenkel/eprop-PyTorch might be a good starting point.
Implement e-prop as in https://arxiv.org/pdf/1901.09049.pdf; see also https://github.com/IGITUGraz/eligibility_propagation for an implementation.