Add Dropout/DropConnect #413
Comments
Did the neural net people really have to assign a new phrase to the idea of overfitting? 😄
In 7de290f I wrote the Dropout layer, so the ticket should focus on the DropConnect implementation.
@zoq Hello, I was wondering if this task is still available. If so, I'm willing to take it on.
I've added the Dropout layer in 7de290f, but I didn't have a chance to implement DropConnect. If you like, you can implement DropConnect.
@zoq I got the difference between the two: one applies the probability mask to the weights (DropConnect) and the other to the activations (Dropout). Can you guide me on where to make changes, though? As I understand it, the input activation should be modified and then assigned to the output activation. Please guide me a bit, because I want to fix this now.
@theaverageguy, I am working on this right now. I will be sending a PR very soon regarding this.
@theaverageguy you are right; I really like the images from the authors: http://cs.nyu.edu/~wanli/dropc/. The implementation of the DropConnectLayer isn't that different from the DropoutLayer, so you can use the DropoutLayer as a basis. Imagine you'd like to create a simple feedforward network, something like this:
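A minimal sketch of such a forward pass in plain C++/Armadillo (the function and variable names and the logistic activation are illustrative assumptions, not mlpack's actual API):

```cpp
#include <armadillo>

// Hypothetical forward pass through a network with one hidden layer:
// input -> linear (w1) -> logistic -> linear (w2) -> logistic -> output.
arma::vec Forward(const arma::vec& input,
                  const arma::mat& w1,
                  const arma::mat& w2)
{
  // Hidden activation: logistic sigmoid of the linear transform.
  const arma::vec hidden = 1.0 / (1.0 + arma::exp(-(w1 * input)));

  // Output activation.
  return 1.0 / (1.0 + arma::exp(-(w2 * hidden)));
}
```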
Now we would like to use DropConnect between the input and the first hidden layer, so what we need to do here is randomly set weights from the inputLayer to 0. Let us modify our feedforward network so that it uses this new DropConnectLayer:
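Sketched the same way (again illustrative names, not mlpack's API), the only change is a binary mask over the input-to-hidden weights:

```cpp
#include <armadillo>

// Same network as above, but with DropConnect on the input -> hidden
// weights: each weight in w1 is kept with probability p, zeroed otherwise.
arma::vec ForwardDropConnect(const arma::vec& input,
                             const arma::mat& w1,
                             const arma::mat& w2,
                             const double p)
{
  // Sample a fresh binary mask with the same shape as w1.
  const arma::mat mask = arma::conv_to<arma::mat>::from(
      arma::randu<arma::mat>(arma::size(w1)) < p);

  // The element-wise product zeroes the dropped connections.
  const arma::vec hidden =
      1.0 / (1.0 + arma::exp(-((mask % w1) * input)));

  return 1.0 / (1.0 + arma::exp(-(w2 * hidden)));
}
```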
As you can see, the constructor of the DropConnectLayer is similar to the DropoutLayer's but takes an additional parameter:
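A sketch of what that could look like (the template shape and member names here are assumptions for illustration, not the actual mlpack declarations):

```cpp
// DropoutLayer only needs the keep probability, while DropConnectLayer
// additionally takes the layer whose weights it will mask.
template<typename WrappedLayer>
class DropConnectLayer
{
 public:
  DropConnectLayer(WrappedLayer& layer, const double p) :
      layer(layer), p(p)
  {
  }

 private:
  WrappedLayer& layer;  // The layer (e.g. LinearLayer) to be masked.
  double p;             // Probability of keeping each weight.
};
```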
In this case, we use a layer (LinearLayer) inside the DropConnectLayer for the weight modification. So the Forward function modifies the weights of the wrapped layer and then uses that layer's own Forward function. I hope this is helpful.
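Continuing the class sketch above, a possible Forward function; the Weights() accessor and the delegation interface are assumptions for illustration:

```cpp
// Inside the DropConnectLayer sketch: mask the wrapped layer's weights,
// delegate the actual transform to it, then restore the weights.
void Forward(const arma::vec& input, arma::vec& output)
{
  const arma::mat mask = arma::conv_to<arma::mat>::from(
      arma::randu<arma::mat>(arma::size(layer.Weights())) < p);

  const arma::mat original = layer.Weights();
  layer.Weights() = mask % original;  // Zero the dropped connections.
  layer.Forward(input, output);       // Reuse the wrapped layer's logic.
  layer.Weights() = original;         // Restore for the next pass.
}
```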
Not sure I get your point, but what if I would like DropConnect for the ConvLayer?
Ok, I got it, thanks.
In drop_connect_layer.hpp, we randomly drop weights instead of units. This layer is based on linear_layer.hpp, with the exception that the weights matrix is multiplied by the mask.
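As a rough sketch of those core operations (not the PR's exact code; note that the DropConnect paper applies the same mask to the weight gradient in the backward pass as well):

```cpp
#include <armadillo>

// Forward and backward steps of a DropConnect linear layer in one place.
// The same binary mask gates both the weights and their gradient.
void DropConnectStep(const arma::mat& weights,
                     const arma::vec& input,
                     const arma::vec& delta,   // Error from the next layer.
                     const double p,
                     arma::vec& output,
                     arma::mat& gradient)
{
  const arma::mat mask = arma::conv_to<arma::mat>::from(
      arma::randu<arma::mat>(arma::size(weights)) < p);

  output = (mask % weights) * input;      // Forward: masked weights.
  gradient = (delta * input.t()) % mask;  // Backward: dropped connections
                                          // receive no gradient.
}
```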
Looks good, the DropConnectLayer is a connection, since it connects two layers. Btw. I really like your commit message.
@zoq Is this issue fixed, or is there something I can work on? I am willing to contribute to it.
@chvsp Sorry, @palashahuja is working on the issue. The problem is, I can't assign anyone who isn't already part of mlpack.
@zoq I am a GSoC 2016 aspirant, and as the PR hasn't been merged for a long time, I thought there would be something that needs work. I don't understand what you meant by "The problem is, I can't assign anyone who isn't already part of mlpack." Is there something I should do to be eligible to fix issues?
@chvsp: it seems to be a shortcoming of GitHub; see https://help.github.com/articles/assigning-issues-and-pull-requests-to-other-github-users/
Dropout is a recently introduced algorithm to prevent co-adaptation during training (overfitting). The key idea is to randomly drop units, along with their connections, from a neural network during training. Roughly, each element of a layer's output is kept with probability p; otherwise it is set to 0.
For more information see:
A simple way to implement the technique is to introduce a new function which creates a dropOutMask. Afterwards, we can multiply the dropOutMask with the inputActivation in all layers which should support dropout. Something like:
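For illustration, a minimal C++/Armadillo sketch of that suggestion; dropOutMask and inputActivation are the names from the text above, everything else is assumed:

```cpp
#include <armadillo>

// Create a binary mask: each element is 1 with probability p (kept)
// and 0 otherwise (dropped).
arma::mat DropOutMask(const size_t rows, const size_t cols, const double p)
{
  return arma::conv_to<arma::mat>::from(
      arma::randu<arma::mat>(rows, cols) < p);
}
```

During training, a layer supporting dropout would then compute `inputActivation = DropOutMask(inputActivation.n_rows, inputActivation.n_cols, p) % inputActivation;` before passing the activation on; at test time no mask is applied and the activations are scaled by p instead, as in the original Dropout paper.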
DropConnect is a generalization of Dropout that takes the idea a step further. Rather than masking each unit's activation, it masks the individual weights/connections: each weight is kept with probability p and zeroed otherwise.
For more information see:
The implementation idea is similar, except that we need to introduce the feature to all connections which should support DropConnect. The modified code should look something like:
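Analogously, a hedged sketch for a connection (plain Armadillo, not the final mlpack code), with the mask applied to the weights rather than the activations:

```cpp
#include <armadillo>

// A connection that supports DropConnect masks its weights matrix
// before computing the output activation.
arma::vec FeedForward(const arma::mat& weights,
                      const arma::vec& inputActivation,
                      const double p)
{
  const arma::mat dropConnectMask = arma::conv_to<arma::mat>::from(
      arma::randu<arma::mat>(arma::size(weights)) < p);

  return (dropConnectMask % weights) * inputActivation;
}
```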