Inconsistencies in schedules API #835

Open
1 of 4 tasks
fabianp opened this issue Feb 29, 2024 · 0 comments
fabianp commented Feb 29, 2024

  • For most schedules, the end value is set through the end_value parameter, but in cosine_decay_schedule it's called alpha. See "deprecate kwarg alpha in cosine_decay_schedule in favor of end_value" (#870).
  • For most schedules, the total number of steps is specified through the transition_steps parameter, but in some cases (e.g., optax.cosine_decay_schedule and optax.warmup_cosine_decay_schedule but, confusingly, not optax.cosine_onecycle_schedule) it's called decay_steps instead.
  • The name sgdr_schedule is not descriptive of what the schedule actually does.
  • Most warm-up schedules, like linear_onecycle_schedule and cosine_onecycle_schedule, specify the length of the warm-up phase through the parameter pct_start, but warmup_cosine_decay_schedule instead specifies it through the parameter warmup_steps.
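
To make the first and last points concrete, here is a rough sketch in plain Python (not optax code) of what a unified naming could look like. The functions cosine_decay and warmup_steps_to_pct_start are hypothetical names made up for illustration. The sketch assumes that alpha acts as a fraction of init_value, and that the total schedule length includes the warm-up steps; neither assumption is confirmed in this issue.

```python
import math
import warnings


def cosine_decay(init_value, decay_steps, end_value=None, alpha=None):
    """Hypothetical cosine decay that prefers `end_value` over the legacy `alpha`."""
    if alpha is not None and end_value is not None:
        raise ValueError("Pass either `end_value` or `alpha`, not both.")
    if alpha is not None:
        warnings.warn(
            "`alpha` is deprecated; use `end_value` instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        # Assumption: `alpha` is a fraction of `init_value`, so convert it
        # to an absolute end value.
        end_value = alpha * init_value
    if end_value is None:
        end_value = 0.0

    def schedule(step):
        # Clamp the step so the value stays at `end_value` after `decay_steps`.
        progress = min(max(step, 0), decay_steps) / decay_steps
        cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
        return end_value + (init_value - end_value) * cosine

    return schedule


def warmup_steps_to_pct_start(warmup_steps, transition_steps):
    """Express a warm-up length in steps as a fraction of the whole schedule."""
    # Assumption: `transition_steps` counts the warm-up steps as well.
    return warmup_steps / transition_steps
```

With a shim like this, callers that still pass alpha would keep working while seeing a DeprecationWarning, and a warm-up of 300 steps in a 1000-step schedule would correspond to pct_start = 0.3.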

In the documentation:
5. In the API reference (https://optax.readthedocs.io/en/latest/api/optimizer_schedules.html) there's a section "Schedules with warm-up". I would consider optax.cosine_onecycle_schedule to have warm-up, yet it's not in this section. My recommendation would be to remove the "Schedules with warm-up" section, put optax.warmup_cosine_decay_schedule in the cosine decay schedule section, and put optax.warmup_exponential_decay_schedule in the exponential decay section.
