[question] Tuning for GAIL and custom envs with time bottlenecks #92

Open
prabhasak opened this issue Jul 8, 2020 · 0 comments

prabhasak commented Jul 8, 2020

Hello. I actively use Stable Baselines (SB) and the zoo for GAIL. My CustomEnv, built on AirSim, trains in (almost) real time, so I have spent months trying to find the right set of hyperparameters (HPs) for GAIL to imitate expert trajectories (generated from an optimal TRPO policy). I have some specific questions regarding TRPO and GAIL:

  1. Since GAIL uses TRPO, I made a copy of the zoo TRPO HPs and called it GAIL. Can I do better? I have had luck imitating simple Gym envs with GAIL, but have had a hard time imitating MuJoCo envs.
  2. Training on CustomEnv for 1e6 timesteps takes ~1.5 days, so I've been avoiding tuning. Would you recommend tuning for GAIL? Do I just copy the TRPO sampler for GAIL? Is there anything else I can do to speed up tuning?
  3. Given both the lack of tuned HPs and the real-time training, is there any other avenue I can try to get GAIL to work on my CustomEnv?
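
To make question 2 concrete, here is a rough sketch of what "copying the TRPO sampler for GAIL" could look like: reuse a TRPO search space and add the discriminator-side parameters. The function names and all value ranges are hypothetical and are not taken from the zoo. The dictionary keys are meant to mirror keyword arguments of the Stable Baselines `TRPO` and `GAIL` constructors, which should be checked against the installed version. It uses plain random search from the standard library rather than the zoo's Optuna-based tuning, and the throughput figure is the ~1.5 days per 1e6 timesteps mentioned above.

```python
import random

# Throughput reported above: about 1e6 timesteps in roughly 1.5 days.
STEPS_PER_DAY = 1e6 / 1.5


def sample_trpo_params(rng):
    """Draw one TRPO configuration (ranges are illustrative assumptions)."""
    return {
        "gamma": rng.choice([0.95, 0.98, 0.99, 0.995, 0.999]),
        "timesteps_per_batch": rng.choice([256, 512, 1024, 2048]),
        "max_kl": 10 ** rng.uniform(-3, -1),       # log-uniform
        "lam": rng.choice([0.9, 0.95, 0.98, 0.99, 1.0]),
        "entcoeff": 10 ** rng.uniform(-8, -1),     # log-uniform
        "vf_stepsize": 10 ** rng.uniform(-5, -2),  # log-uniform
    }


def sample_gail_params(rng):
    """Start from the TRPO space, then add discriminator-side parameters."""
    params = sample_trpo_params(rng)
    params.update({
        "hidden_size_adversary": rng.choice([64, 100, 128, 256]),
        "adversary_entcoeff": 10 ** rng.uniform(-4, -2),
        "g_step": rng.choice([1, 3, 5]),   # generator updates per iteration
        "d_step": rng.choice([1, 2, 3]),   # discriminator updates per iteration
        "d_stepsize": 10 ** rng.uniform(-5, -3),
    })
    return params


def trial_days(n_trials, timesteps_per_trial):
    """Estimate wall-clock days for sequential trials at the reported throughput."""
    return n_trials * timesteps_per_trial / STEPS_PER_DAY


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the draws are reproducible
    for i in range(3):
        print(i, sample_gail_params(rng))
    print("days for 10 trials of 1e5 steps:", trial_days(10, 1e5))
```

The `trial_days` helper only restates the time bottleneck: at this throughput, any tuning run is bounded by the total number of environment steps, so shorter trials are the main lever unless the environment itself can be sped up or parallelised.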

Any help is greatly appreciated. Thank you for these awesome repos!

CustomEnv info:
obs: 6-dim, continuous
action: 3-dim, continuous
rewards: dense, with a large reward at the goal
