Hello,
The formula in the documentation for `cosine_decay_schedule` (https://optax.readthedocs.io/en/latest/api/optimizer_schedules.html#optax.cosine_decay_schedule) would suggest that the learning rate increases again after T steps.

A quick look at the code confirms this is not the case, but it may be good to state it explicitly, as is done for `linear_schedule`.

Happy to make a short PR! I could also propose a short formula/pseudocode for functions like `piecewise_constant_schedule` that do not have one.
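For concreteness, here is a minimal sketch of the clamped behavior, mirroring what the code appears to do (an illustration, not the exact library source):

```python
import jax.numpy as jnp

def cosine_decay(init_value, decay_steps, alpha=0.0):
    """Sketch of a cosine decay schedule that stays flat after decay_steps."""
    def schedule(count):
        # Clamp the step count so the cosine argument never exceeds pi:
        # for count >= decay_steps the rate stays at alpha * init_value.
        count = jnp.minimum(count, decay_steps)
        cosine = 0.5 * (1.0 + jnp.cos(jnp.pi * count / decay_steps))
        return init_value * ((1.0 - alpha) * cosine + alpha)
    return schedule
```

Evaluated at `count >= decay_steps`, this returns `alpha * init_value` and never rises again, whereas a literal reading of the docs formula would suggest the cosine wraps around.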
Best
GJ
Hello @gjhuizing,
Thanks for catching this! If you are willing to make such a PR, that would be great!
Great!