This is a PyTorch implementation of Trust Region Policy Optimization (TRPO).
- python 3.5.2
- openai-gym
- mujoco-1.50.1.56
- pytorch-0.4.0
Install OpenAI Baselines. (Baselines is updated very frequently, so please use the older version pinned below; this will be resolved in the future.)
# clone the openai baselines
git clone https://github.com/openai/baselines.git
cd baselines
git checkout 366f486
pip install -e .
python train_network.py
python demo.py
Please download the pre-trained models from Google Drive, then put the saved_models
folder under the current folder.
Note: newer versions of openai-gym have a rendering problem, so I use Walker2d-v1 for the demo.
Tip: while watching the demo, you can press TAB to switch the camera in MuJoCo.