Adan Optimizer #401

Open
joaogui1 opened this issue Aug 24, 2022 · 8 comments · May be fixed by #410

Comments

@joaogui1
Contributor

Is there any interest in adding the Adan optimizer to optax? If so I can do it

@mkunesch
Member

Interesting, I think that would be great! Thanks a lot!

Let us know if you'd like to discuss anything about the implementation as you write it. How much of the Adam code do you think can be reused?

@mtthss
Collaborator

mtthss commented Aug 30, 2022

This looks very interesting, it would be an amazing contribution!

@joaogui1 joaogui1 linked a pull request Sep 4, 2022 that will close this issue

@adam-hartshorne

I see there is an issue with the replicability of the pull request. There appears to be another JAX implementation for optax here, which might be worth looking at.

https://github.com/hr0nix/optax-adan

@joaogui1
Contributor Author

Thanks for the pointer @adam-hartshorne! Sadly, when I test that implementation in my Colab, its error is three orders of magnitude larger than mine. I'm not sure where the difference comes from; the only difference I can see is the epsilon placement.

@hr0nix

hr0nix commented Sep 18, 2022

Hi, author of optax-adan here. Is it possible to share a Colab where you've compared both implementations? I'd like to figure out where the difference comes from.

@hr0nix

hr0nix commented Sep 18, 2022

Oh, found the link in the pull request, so no worries.

@hr0nix

hr0nix commented Sep 18, 2022

Yep, the difference between the implementations comes from epsilon placement. If I move it outside the sqrt, the results are equal.
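
For readers following along, here is a minimal sketch of why epsilon placement can matter this much. It is not taken from either implementation: the function name `scaled_update`, the flag `eps_inside_sqrt`, and the sample values are all hypothetical, and it uses plain Python rather than JAX so it runs anywhere. It only compares the two generic forms `m / (sqrt(v) + eps)` and `m / sqrt(v + eps)`.

```python
import math


def scaled_update(m, v, eps, eps_inside_sqrt):
    """Divide a first-moment estimate m by a second-moment-based denominator.

    Hypothetical helper for illustration only; not the optax or
    optax-adan code.
    """
    if eps_inside_sqrt:
        # Epsilon added before the square root: sqrt(v + eps).
        return m / math.sqrt(v + eps)
    # Epsilon added after the square root: sqrt(v) + eps.
    return m / (math.sqrt(v) + eps)


eps = 1e-8

# Small statistics (assumed values): v is much smaller than eps, so
# sqrt(v + eps) is dominated by sqrt(eps) = 1e-4, not by sqrt(v) = 1e-6.
m_small, v_small = 1e-6, 1e-12
outside_small = scaled_update(m_small, v_small, eps, eps_inside_sqrt=False)
inside_small = scaled_update(m_small, v_small, eps, eps_inside_sqrt=True)

# Large statistics (assumed values): eps is negligible either way.
m_large, v_large = 1.0, 1.0
outside_large = scaled_update(m_large, v_large, eps, eps_inside_sqrt=False)
inside_large = scaled_update(m_large, v_large, eps, eps_inside_sqrt=True)

print("small v:", outside_small, inside_small)
print("large v:", outside_large, inside_large)
```

With these assumed values the two placements agree closely when `v` is large, but differ by roughly a factor of 100 when `v` is far below `eps`, because inside the root the effective floor on the denominator is `sqrt(eps)` rather than `eps`. Whether this accounts for the full gap seen in the Colab depends on the actual gradient magnitudes there.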

@hr0nix

hr0nix commented Sep 18, 2022

After this change is merged and released, I'll put a note in README.md of optax-adan that there is no need to use the package as adan is now implemented in optax.

5 participants