Prioritized Experience Replay for DQN #1242

Open
1 task done
vnvdev opened this issue Dec 26, 2022 · 10 comments · May be fixed by #1622
Labels
enhancement New feature or request help wanted Help from contributors is welcomed

Comments

@vnvdev

vnvdev commented Dec 26, 2022

🚀 Feature

Prioritized Experience Replay for DQN

Motivation

No response

Pitch

No response

Alternatives

No response

Additional context

No response

Checklist

  • I have checked that there is no similar issue in the repo
@vnvdev vnvdev added the enhancement New feature or request label Dec 26, 2022
@qgallouedec
Collaborator

It's planned, contributions are welcome 🙂

@araffin
Member

araffin commented Dec 26, 2022

See #622

@AlexPasqua
Contributor

@araffin @qgallouedec
Hello, is there any news on prioritized experience replay, or are you still waiting for contributions?

@araffin araffin added the help wanted Help from contributors is welcomed label Mar 17, 2023
@araffin
Member

araffin commented Mar 18, 2023

or are you still waiting for contributions?

We are welcoming contributions =)
I guess adapting https://github.com/Howuhh/prioritized_experience_replay from @Howuhh would be a good contribution.

@emrul

emrul commented Mar 21, 2023

Hi @araffin - just looking at this. How would you go about it in relation to the vectorised replay buffer that SB3 uses: have one segment tree hold priorities across all envs or have a segment tree per environment? I had a cursory look at how Tianshou does it and it appears to be a segment tree per environment (at least at first glance).
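To make the first option concrete, here is a minimal sketch of a single sum tree shared by all envs, assuming the buffer is indexed by `(pos, env_idx)` and each pair is flattened to one leaf. `SumTree`, `flat_index`, and `sample` are hypothetical names for illustration only; they are not SB3 or Tianshou APIs, and the sketch leaves out the PER exponent and importance-sampling weights.

```python
import numpy as np


class SumTree:
    """Binary sum tree; leaf i stores the priority of flattened transition i."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        # Flat layout: node 1 is the root, leaves live at [capacity, 2 * capacity).
        self.nodes = np.zeros(2 * capacity, dtype=np.float64)

    def update(self, leaf: int, priority: float) -> None:
        idx = leaf + self.capacity
        self.nodes[idx] = priority
        idx //= 2
        while idx >= 1:  # recompute the sums on the path up to the root
            self.nodes[idx] = self.nodes[2 * idx] + self.nodes[2 * idx + 1]
            idx //= 2

    @property
    def total(self) -> float:
        return float(self.nodes[1])

    def find(self, mass: float) -> int:
        """Return the leaf whose priority interval contains `mass`."""
        idx = 1
        while idx < self.capacity:
            left = 2 * idx
            if mass < self.nodes[left]:
                idx = left
            else:
                mass -= self.nodes[left]
                idx = left + 1
        return idx - self.capacity


def flat_index(pos: int, env_idx: int, n_envs: int) -> int:
    # Assumed layout: one leaf per (buffer position, env) pair.
    return pos * n_envs + env_idx


def sample(tree: SumTree, batch_size: int, n_envs: int, rng: np.random.Generator):
    # Stratified sampling: one uniform draw per equal-mass segment.
    bounds = np.linspace(0.0, tree.total, batch_size + 1)
    masses = rng.uniform(bounds[:-1], bounds[1:])
    leaves = np.array([tree.find(m) for m in masses])
    # Map the flat leaves back to (pos, env_idx).
    return np.divmod(leaves, n_envs)


buffer_size, n_envs = 4, 2
tree = SumTree(buffer_size * n_envs)
tree.update(flat_index(1, 0, n_envs), 3.0)  # transition at pos=1 in env 0
tree.update(flat_index(2, 1, n_envs), 1.0)  # transition at pos=2 in env 1
pos, env_idx = sample(tree, batch_size=4, n_envs=n_envs, rng=np.random.default_rng(0))
```

The per-environment alternative would keep `n_envs` such trees of size `buffer_size` and choose an env before descending one of them; which variant is cleaner or faster inside SB3 is exactly the open question here.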

@araffin
Member

araffin commented Mar 29, 2023

How would you go about it in relation to the vectorised replay buffer that SB3 uses: have one segment tree hold priorities across all envs or have a segment tree per environment? I had a cursory look at how Tianshou does it and it appears to be a segment tree per environment (at least at first glance).

Not sure, I need to take a deeper look, but probably one tree for all envs if possible, or whatever is cleaner/fast enough.
We might need to do something similar to: #704

@mkhlyzov

Hi @araffin - just looking at this. How would you go about it in relation to the vectorised replay buffer that SB3 uses: have one segment tree hold priorities across all envs or have a segment tree per environment? I had a cursory look at how Tianshou does it and it appears to be a segment tree per environment (at least at first glance).

I think it might matter depending on the replacement strategy. Do you overwrite the latest observation or the one with the lowest priority?
What happens if the VecEnv holds different environments, e.g. LunarLander with different gravity / wind parameters? If one environment is significantly more difficult than the others, wouldn't a joint buffer be skewed toward it? "Hard overall" observations vs. "hard for each on average" observations.
It's more of a theoretical question though.
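A small numeric sketch of that skew, under made-up priorities (the numbers are purely illustrative, and priorities are used as-is, i.e. without the PER exponent). Env 1 plays the role of the harder environment with larger TD errors.

```python
import numpy as np

# Hypothetical |TD error| priorities, shape (buffer_size, n_envs).
priorities = np.array([[1.0, 4.0],
                       [1.0, 6.0],
                       [2.0, 6.0]])

# Joint buffer: one distribution over all (position, env) pairs.
joint_probs = priorities / priorities.sum()
joint_env_share = joint_probs.sum(axis=0)  # fraction of samples drawn from each env

# Per-env buffers: pick an env uniformly, then sample by priority within it.
n_envs = priorities.shape[1]
per_env_probs = priorities / priorities.sum(axis=0, keepdims=True) / n_envs
per_env_share = per_env_probs.sum(axis=0)

print(joint_env_share)    # env 1 gets 16/20 of the draws
print(per_env_share)      # each env gets half of the draws
```

With a joint distribution the harder env dominates the batch ("hard overall"), while per-env normalization keeps the env shares equal and only prioritizes within each env ("hard for each on average").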

@AlexPasqua
Contributor

We are welcoming contributions =) I guess adapting https://github.com/Howuhh/prioritized_experience_replay from @Howuhh would be a good contribution.

Hello @araffin,
since I've recently used and contributed to @Howuhh's PER implementation, and since I'm also familiar with SB3 (having contributed before), I could work on adapting it for this library!
(and maybe @Howuhh wants to join as well?)

@Howuhh

Howuhh commented Jul 20, 2023

@AlexPasqua even though I think it's very important, I'm unfortunately busy integrating Minari into CORL at the moment, so I'm unlikely to find the time to do it. But I'd be glad if my implementation is useful!

@AlexPasqua
Contributor

@AlexPasqua even though I think it's very important, I'm unfortunately busy integrating Minari into CORL at the moment, so I'm unlikely to find the time to do it. But I'd be glad if my implementation is useful!

Alright, no problem, I'll do it myself :)

@AlexPasqua AlexPasqua linked a pull request Jul 23, 2023 that will close this issue
16 tasks
7 participants