Non-smoothness of mxPenalty functions #343
That's a nice improvement. Can you make your proposal the default and rename the current lasso to "oldLasso"?
Awesome! I will give it a try, but it may take some time as I am not familiar enough with the OpenMx code base yet and don't want to break anything. I'll keep you updated.
I'm happy to answer any questions or review the change.
I've created some preliminary versions of the alternative penalty functions in a fork (https://github.com/jhorzek/OpenMx/tree/penaltyFunctions). Do you already have scripts to test all penalty functions that I could use to test my implementation?
See
This is exactly the wrong approach: we should use your new implementation by default. So move the current implementation to … The naming in the C++ code can stay the same, though, since that's not user visible. Otherwise, looks good.
Also, add yourself as a contributor (ctb) in DESCRIPTION.in. |
I did not know about the git grep function; thank you for the info! I wanted to leave the current implementation of the penalty functions untouched until I am fairly certain that my implementation works. I'll change all names to the ones you suggested before creating a pull request, however.
I am sorry for leaving this issue open for so long. When implementing regularized SEM with non-smooth penalty functions, I found that the combination of fit indices and smoothed penalties can be problematic. |
Dear all,
I am very excited to see that OpenMx now also supports regularization of parameters with lasso and similar penalty functions! If I understand it correctly, OpenMx uses a smooth approximation of the non-differentiable penalty functions; the documentation states: "Smoothing is controlled by smoothProportion. If smoothProportion is zero then the traditional discontinuous functions are used. Otherwise, smoothProportion of the region between epsilon and zero is used for smoothing." I wanted to learn more about this procedure and looked into the code files. I may be totally wrong here, as I don't have much experience with the C++ code underlying OpenMx, but to my understanding an R translation of the lasso functions would look as follows (see penalty.cpp, lines 46 and following):
If I am not mistaken, the current implementation results in a penalty function that has more non-differentiable points than the lasso: four instead of one.
Also, in the gradient function it seems that the contribution of the penalty term is set to zero close to the origin.
Note that at a parameter value of zero one would typically use subgradients, which for the lasso cover the interval [-1, 1]. If the gradient of the penalty function is set to zero there, all that is left is the gradient of the log-likelihood. This gradient will be non-zero in most cases, so the optimizer may have difficulty locating the actual minimum.
An alternative could be to approximate the lasso penalty as proposed by Lee et al. (2006; see epsL1 on p. 405), which replaces the absolute value with a surrogate that is differentiable everywhere.
This implementation may come with other limitations, however. In my experience, a specialized optimizer for lasso penalty functions often returns better results.
Best,
Jannik
Lee, S.-I., Lee, H., Abbeel, P., & Ng, A. Y. (2006). Efficient L1 Regularized Logistic Regression. Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI-06), 401–408.