
Create optimizer in OnPolicyAlgorithm only after the device is set #1771

Open · wants to merge 1 commit into base: master

Conversation

@cmangla commented Dec 4, 2023

Attempt to fix #1770 in a fully backward-compatible manner.

Description

In PPO, the policy's optimizer is created before the algorithm's computation device is set. That is a problem for optimizers that check the target device on initialization, such as PyTorch's fused Adam, which requires its parameters to already be on a CUDA device. This fix defers optimizer creation until after the policy has been moved to its device, and is fully backward compatible.
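
For illustration, a minimal sketch of the ordering bug and the fix; the `Policy` class and `_build_optimizer` helper below are hypothetical stand-ins, not SB3's actual code:

```python
import torch as th


class Policy(th.nn.Module):
    """Hypothetical stand-in for an SB3 policy, not the library's real class."""

    def __init__(self, init_optimizer: bool = True):
        super().__init__()
        self.net = th.nn.Linear(4, 2)
        if init_optimizer:
            self._build_optimizer()

    def _build_optimizer(self) -> None:
        # With fused=True, Adam checks at construction time that every
        # parameter already lives on a supported (CUDA) device.
        self.optimizer = th.optim.Adam(self.parameters(), lr=3e-4, fused=True)


# Buggy order: the optimizer is built while the parameters are still on the
# CPU, so fused Adam raises in __init__ even though .to("cuda") comes later:
#     policy = Policy().to("cuda")
#
# Fixed order: move the module to its device first, then build the optimizer.
policy = Policy(init_optimizer=False).to("cuda")
policy._build_optimizer()
```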

Motivation and Context

Fixes #1770. One can now use the `fused` option of the Adam optimizer on CUDA devices, which, according to the PyTorch documentation, is faster.
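
As a usage sketch: the `policy_kwargs`/`optimizer_kwargs` plumbing is SB3's documented API, while `fused=True` assumes a CUDA device and a PyTorch version that supports fused Adam.

```python
from stable_baselines3 import PPO

# optimizer_kwargs are forwarded to the optimizer constructor. With this fix,
# the policy is moved to the GPU before Adam is created, so fused=True works.
model = PPO(
    "MlpPolicy",
    "CartPole-v1",
    policy_kwargs=dict(optimizer_kwargs=dict(fused=True)),
    device="cuda",
)
model.learn(total_timesteps=10_000)
```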

  • I have raised an issue to propose this change

Types of changes

  • Bug fix

Checklist

  • I've read the CONTRIBUTION guide (required)
  • I have updated the changelog accordingly (required).
  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.
  • I have reformatted the code using make format (required)
  • I have checked the codestyle using make check-codestyle and make lint (required)
  • I have ensured make pytest and make type both pass. (required)
  • I have checked that the documentation builds using make doc (required)

@cmangla marked this pull request as draft December 4, 2023 17:57
@cmangla marked this pull request as ready for review December 5, 2023 10:27
@cmangla changed the title from "Create optimizer in PPO only after the device is set" to "Create optimizer in OnPolicyAlgorithm only after the device is set" Dec 5, 2023
@cmangla marked this pull request as draft December 5, 2023 12:16
@cmangla marked this pull request as ready for review December 5, 2023 14:01
@cmangla (Author) commented Dec 5, 2023

@araffin This is ready for the CI tests now and potentially also to merge.

@@ -885,6 +895,7 @@ def __init__(
     normalize_images: bool = True,
     optimizer_class: Type[th.optim.Optimizer] = th.optim.Adam,
     optimizer_kwargs: Optional[Dict[str, Any]] = None,
+    _init_optimizer=True,  # Currently unused, see PR #1771
@cmangla (Author) commented:
@araffin I'm currently testing enabling this one too. I will update this PR accordingly, hence switching it back to draft.

@cmangla marked this pull request as draft December 6, 2023 12:07
@cmangla marked this pull request as ready for review December 7, 2023 15:03
@cmangla (Author) commented Dec 7, 2023

@araffin Looks good now

Successfully merging this pull request may close these issues:

[Bug:] Cannot use the fused flag in default optimizer of PPO (#1770)