[Feature Request] STAC algorithm #223

Open
2 tasks done
EloyAnguiano opened this issue Jan 4, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@EloyAnguiano

🚀 Feature

Build the STAC algorithm as a callable algorithm: https://arxiv.org/pdf/2002.12928.pdf

Motivation

Hyperparameter tuning is one of the most time-consuming and costly parts of training RL agents. This implementation might save some people time and cost, and it could be the first actor-critic (AC) algorithm here that uses meta-gradients, giving a base for further improvements.

Pitch

I would like someone to guide me on where to start, or to give me some key insights into how this could be coded.

Alternatives

The alternative is that someone else codes it themselves.

Additional context

No response

Checklist

  • I have checked that there is no similar issue in the repo
  • If I'm requesting a new feature, I have proposed alternatives
@EloyAnguiano EloyAnguiano added the enhancement New feature or request label Jan 4, 2024
@araffin
Member

araffin commented Jan 10, 2024

Hello,
are you willing to implement and benchmark the algorithm?

@EloyAnguiano
Author

Yes, I would like to try. Is there an official benchmark to use, or are there any coding guidelines?

@EloyAnguiano
Author

The algorithm is an off-policy one. Is there an example to start from for this kind of algorithm?

@araffin
Member

araffin commented Jan 12, 2024

> The algorithm is an off-policy one. Is there an example to start from for this kind of algorithm?

See #4, and please read https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/blob/master/CONTRIBUTING.md
