Solver issues #90

Open
billbrod opened this issue Jan 26, 2024 · 0 comments
Comments

@billbrod (Member)
Copying @ahwillia's comments from flatironinstitute/nemos-workshop-feb-2024#8:

You can come up with convex problems for which gradient descent takes essentially forever to converge but second-order methods (e.g., Newton's method) perform well. So, in theory, the choice of algorithm really can matter! 🙂
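To make this concrete (this example is not from the original thread, just an illustration): on a badly conditioned quadratic, gradient descent barely moves along the low-curvature direction, while a single Newton step lands on the optimum exactly.

```python
import jax
import jax.numpy as jnp

# Quadratic loss 0.5 * w^T H w with condition number 1e4.
H = jnp.diag(jnp.array([1.0, 1e4]))

def loss(w):
    return 0.5 * w @ H @ w

# Gradient descent: the step size is capped by the largest eigenvalue,
# so the low-curvature coordinate shrinks by only (1 - 1e-4) per step.
w_gd = jnp.array([1.0, 1.0])
lr = 1e-4
for _ in range(1000):
    w_gd = w_gd - lr * jax.grad(loss)(w_gd)
print("gradient descent after 1000 steps:", w_gd)  # first coordinate is still ~0.9

# Newton's method: one step w - H^{-1} grad reaches the optimum exactly,
# because the local quadratic model is the true loss.
w_newton = jnp.array([1.0, 1.0])
w_newton = w_newton - jnp.linalg.solve(H, jax.grad(loss)(w_newton))
print("Newton after 1 step:", w_newton)  # (0, 0)
```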

Ideally, I would like to delete this part of the tutorial and engineer around this problem. At the end of model.fit we should check that the gradient is zero within some tolerance (specified at model initialization). If the norm of the gradient is above this tolerance, we should raise a prominent warning explaining that they should try a different optimization method / solver, etc.
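A minimal sketch of what that post-fit check could look like (the function name, the tol argument, and the use of jax.grad on the model's loss are assumptions for illustration, not the actual nemos API):

```python
import warnings
import jax
import jax.numpy as jnp

def check_convergence(loss_fn, params, data, tol=1e-5):
    """Warn if the fitted parameters are not at a stationary point.

    loss_fn, params, data, and tol are placeholders for whatever the
    model actually exposes; this is only a sketch of the idea.
    """
    grad_norm = jnp.linalg.norm(jax.grad(loss_fn)(params, data))
    if grad_norm > tol:
        warnings.warn(
            f"Optimization may not have converged: gradient norm is "
            f"{grad_norm:.2e} (tolerance {tol:.2e}). Consider trying a "
            "different solver; see the docs on debugging optimization failures.",
            UserWarning,
        )
    return grad_norm
```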

We may want to have a separate docs page focused on "debugging optimization failures" -- the warning message could link to it. My guess is that optimization failures result from (a) using float32 instead of float64, (b) not having enough regularization, so the problem is only weakly convex (adding regularization should make it strictly convex), or (c) problems with jaxopt that should be fixed (e.g., the line search seems to do a poor job if the initial learning rate is not well tuned?).
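A sketch of how the first two fixes might look in practice; the L2 penalty below is a generic ridge term added by hand, and the jaxopt LBFGS call is just one possible solver choice, not necessarily how nemos wires in its regularizers or solvers:

```python
import jax
import jax.numpy as jnp
import jaxopt

# (a) run in double precision: float32 can stall a solver near the optimum
jax.config.update("jax_enable_x64", True)

# (b) add an L2 (ridge) penalty so a weakly convex problem becomes
# strictly convex and easier to solve
def penalized_loss(params, X, y, strength=1e-3):
    residuals = X @ params - y
    return jnp.mean(residuals ** 2) + strength * jnp.sum(params ** 2)

# (c) if a line-search-based solver misbehaves, try a different solver
# or tune the initial step size
solver = jaxopt.LBFGS(fun=penalized_loss, maxiter=500, tol=1e-8)
X = jnp.ones((100, 3))   # toy design matrix, stand-in for real data
y = jnp.zeros(100)
params, state = solver.run(jnp.zeros(3), X=X, y=y)
```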
