Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

handle num_envs > 1 in DQN #395

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ronuchit
Copy link

@ronuchit ronuchit commented Jun 6, 2023

Description

We remove the requirement that num_envs=1 in the DQN implementation. See discussion in:
#370 (comment)

Testing

I ran python cleanrl/dqn.py --env-id CartPole-v1 --num-envs 2 to test the change. This was crashing before I made my change, but now runs.

Types of changes

  • [ x ] Bug fix
  • New feature
  • New algorithm
  • Documentation

Checklist:

  • [ x ] I've read the CONTRIBUTION guide (required).
  • I have ensured pre-commit run --all-files passes (required).
  • I have updated the tests accordingly (if applicable).
  • I have updated the documentation and previewed the changes via mkdocs serve.
    • I have explained note-worthy implementation details.
    • I have explained the logged metrics.
    • I have added links to the original paper and related papers.

@vercel
Copy link

vercel bot commented Jun 6, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
cleanrl ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jun 6, 2023 9:01pm

@sdpkjc
Copy link
Collaborator

sdpkjc commented Jun 7, 2023

Here we may need to pay attention to the problem of the increment of global_step (+1? or +args.num_envs?)

  • If +args.num_envs, then we need to consider the divisibility relationship between global_step and log recording points (for example: if global_step % 100 == 0:), and the divisibility relationship between global_step and args.train_frequency (if global_step % args.train_frequency == 0:)
    • We should give some notes or warnings when we cannot be divided.
  • If +1, then we need to consider whether our total_timesteps is still correct. We may need to write the log like this: writer.add_scalar("charts/episodic_return", info["episode"]["r"], global_step*args.num_envs).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants