[Feature Request] Implementation of Lattice exploration (Chiappa et al., NeurIPS 2023) #1829

Open
2 tasks done
albertochiappa opened this issue Feb 9, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@albertochiappa

🚀 Feature

I propose adding Lattice exploration as an option in Stable Baselines3. Lattice is an action noise that some colleagues and I presented in this NeurIPS paper last year. It injects noise into the policy network before the last dense layer, which makes the action distribution a multivariate Gaussian with a full covariance matrix. It can improve the performance of SAC and PPO in high-dimensional environments with many actuators. In particular, we have used it successfully in the musculoskeletal simulation library MyoSuite, where we benchmarked it together with recurrent PPO and obtained good results:

[Figure: myosuite_learning_curves]

We also tested it with SAC in the common PyBullet locomotion environments, where it is especially competitive in Humanoid:

[Figure: pybullet_learning_curves_with_pink]

It also powered our winning solution to the NeurIPS MyoChallenge 2023.
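
As a rough illustration of why noise injected before the last dense layer yields correlated action noise, here is a minimal sketch in plain Python. It is not the code from the paper or the fork. It assumes a purely linear last layer `W x + b` and independent Gaussian noise on each latent unit; the names `linear`, `sample_action`, `action_covariance`, and `empirical_covariance`, as well as all numeric values, are made up for the example. Under those assumptions the action covariance is `W diag(std^2) W^T`, which generally has non-zero off-diagonal entries.

```python
import random


def linear(W, b, x):
    """Last dense layer: returns W @ x + b for list-based W, b, x."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def sample_action(W, b, latent, latent_std, rng):
    """Perturb the latent state with independent Gaussian noise, then apply the layer."""
    noisy_latent = [x + rng.gauss(0.0, s) for x, s in zip(latent, latent_std)]
    return linear(W, b, noisy_latent)


def action_covariance(W, latent_std):
    """Analytic covariance of the sampled action: W diag(std^2) W^T."""
    var = [s * s for s in latent_std]
    return [[sum(wi * wj * v for wi, wj, v in zip(row_i, row_j, var))
             for row_j in W] for row_i in W]


def empirical_covariance(samples):
    """Unbiased sample covariance of a list of equally sized vectors."""
    n = len(samples)
    dim = len(samples[0])
    mean = [sum(s[d] for s in samples) / n for d in range(dim)]
    return [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
             for j in range(dim)] for i in range(dim)]


# Illustrative values only: 3 latent units mapped to 2 actuators.
W = [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.5]]
b = [0.0, 0.0]
latent = [0.2, -0.1, 0.4]
latent_std = [0.3, 0.3, 0.3]

rng = random.Random(0)
samples = [sample_action(W, b, latent, latent_std, rng) for _ in range(20000)]

cov = action_covariance(W, latent_std)   # analytic covariance
emp = empirical_covariance(samples)      # should be close to cov
print("analytic :", cov)
print("empirical:", emp)
```

Because the two actuators share latent units through `W`, their noise is correlated, in contrast to the diagonal covariance obtained when independent noise is added directly to each action.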

Motivation

It would be easier for SB3 users to test Lattice in their environment of choice if it were part of the library, rather than having to install a separate package or download another repository. The change is incremental and does not break any of the library's current behavior.

Pitch

I have tried my best to integrate Lattice into SB3 while modifying the codebase as little as possible. In the branch feature/lattice of this fork of SB3, I have implemented Lattice for SAC and PPO. It can be enabled by setting the argument "use_lattice=True" and passing additional hyperparameters in a dictionary called "lattice_kwargs". It seems to work correctly when called from the configuration files of SB3 zoo. I would invite an SB3 developer to check whether the integration I propose follows the library's guidelines and spirit. If you have no major concerns, I would be happy to prepare a pull request!
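
To make the two options concrete, here is a small sketch of how a set of hyperparameters might be extended with them. Only the names "use_lattice" and "lattice_kwargs" come from the description above. The helper `with_lattice` is hypothetical, and whether the fork expects these arguments in the algorithm constructor or elsewhere (for example in `policy_kwargs`) is an assumption to be checked against the branch. The keys accepted inside "lattice_kwargs" are defined by the fork and are left empty here.

```python
def with_lattice(algo_kwargs, lattice_kwargs=None):
    """Return a copy of algo_kwargs with the two Lattice options added.

    Hypothetical helper: it only builds a dictionary and does not import SB3.
    """
    merged = dict(algo_kwargs)
    merged["use_lattice"] = True
    # Lattice-specific hyperparameters; valid keys are defined by the fork.
    merged["lattice_kwargs"] = dict(lattice_kwargs or {})
    return merged


# Ordinary PPO hyperparameters, extended with the Lattice options.
base_kwargs = {"policy": "MlpPolicy", "learning_rate": 3e-4}
ppo_kwargs = with_lattice(base_kwargs)

# With the fork installed, these might be passed on as PPO(env=env, **ppo_kwargs),
# assuming the constructor accepts both arguments.
print(ppo_kwargs)
```

Keeping the options in a plain dictionary mirrors how hyperparameters are typically listed in a configuration file, so the same two entries could be added there.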

Alternatives

Alternatively, Lattice could become part of the contrib repository of SB3. However, I don't see a way to implement it there without creating entirely new algorithms (e.g., LatticePPO, LatticeSAC, …), which is, in my opinion, excessive, given that only relatively limited changes to the original algorithms are needed to enable this option.

Additional context

No response

Checklist

  • I have checked that there is no similar issue in the repo
  • If I'm requesting a new feature, I have proposed alternatives
@albertochiappa albertochiappa added the enhancement New feature or request label Feb 9, 2024
@AlexEMG

AlexEMG commented Feb 12, 2024

Thanks for considering the feature proposal. We're looking forward to hearing your feedback!

@araffin
Member

araffin commented Feb 13, 2024

Hello,
thanks for the proposal.
I still need to read the paper, but I would welcome anyway if you could do a PR for the project section of the documentation =) (combining Lattice exploration and MyoChallenge)

@araffin
Member

araffin commented Feb 15, 2024

> I still need to read the paper, but I would welcome anyway if you could do a PR for the project section of the documentation =) (combining Lattice exploration and MyoChallenge)

I have read the paper, but I still need some time to process the content (I'll probably be back with some questions). In the meantime, I would be happy to receive a PR that updates the project section.

3 participants