Trust Region Policy Optimization (TRPO)

This is a PyTorch implementation of Trust Region Policy Optimization (TRPO).
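
For orientation, the constrained problem that TRPO solves at each policy update can be written in its standard textbook form as below. The notation is an assumption made here for illustration (θ for the policy parameters, A for the advantage estimate, δ for the KL bound); it is not taken from this repository's code, which may name or approximate these quantities differently.

```latex
\max_{\theta} \;
\mathbb{E}_{s, a \sim \pi_{\theta_{\text{old}}}}
\left[
  \frac{\pi_{\theta}(a \mid s)}{\pi_{\theta_{\text{old}}}(a \mid s)}
  \, A^{\pi_{\theta_{\text{old}}}}(s, a)
\right]
\quad \text{subject to} \quad
\mathbb{E}_{s \sim \pi_{\theta_{\text{old}}}}
\left[
  D_{\mathrm{KL}}\!\left(
    \pi_{\theta_{\text{old}}}(\cdot \mid s) \,\|\, \pi_{\theta}(\cdot \mid s)
  \right)
\right] \le \delta
```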

Requirements

python 3.5.2
openai-gym
mujoco-1.50.1.56
pytorch-0.4.0
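
Since the versions above are fairly old, it can help to check the environment before training. The following is a minimal sketch, not part of this repository: the helper names `version_matches` and `check_environment` are hypothetical, and it only checks Python and PyTorch (not openai-gym or mujoco). It avoids f-strings so that it also runs on Python 3.5.

```python
import sys
from importlib import import_module

# Versions taken from the Requirements list; the helpers are hypothetical.
EXPECTED_PYTHON = "3.5"
EXPECTED_TORCH = "0.4"


def version_matches(found, expected_prefix):
    """Compare dotted version strings component by component.

    "0.4.0" matches "0.4", but "0.40.0" does not.
    """
    expected_parts = expected_prefix.split(".")
    found_parts = found.split(".")
    return found_parts[:len(expected_parts)] == expected_parts


def check_environment():
    """Return a list of warning strings; the list is empty if all checks pass."""
    warnings = []

    python_version = "{}.{}".format(sys.version_info[0], sys.version_info[1])
    if not version_matches(python_version, EXPECTED_PYTHON):
        warnings.append("python {} found, {} expected".format(
            python_version, EXPECTED_PYTHON))

    try:
        torch = import_module("torch")
    except ImportError:
        warnings.append("pytorch is not installed")
    else:
        torch_version = str(torch.__version__)
        if not version_matches(torch_version, EXPECTED_TORCH):
            warnings.append("pytorch {} found, {} expected".format(
                torch_version, EXPECTED_TORCH))

    return warnings


if __name__ == "__main__":
    for message in check_environment():
        print("warning:", message)
```

A mismatch is only reported as a warning, because newer versions may still work with small changes.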

Installation

Install OpenAI Baselines. Baselines is updated frequently, so please use the older commit pinned below; support for newer versions is planned.

# clone the openai baselines
git clone https://github.com/openai/baselines.git
cd baselines
git checkout 366f486
pip install -e .

Instructions to run the code

Train the Network:

python train_network.py

Test the Network

python demo.py

Download the Pre-trained Model

Please download the pre-trained models from Google Drive, then put the saved_models folder under the current directory.
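
To confirm the download landed in the right place before running the demo, a quick check like the one below can be used. This is a sketch under assumptions: `find_saved_models` is a hypothetical helper, not a function from this repository, and it only checks that a `saved_models` folder exists, not what is inside it.

```python
import os


def find_saved_models(root="."):
    """Return the path to the saved_models folder under root, or None if absent."""
    path = os.path.join(root, "saved_models")
    return path if os.path.isdir(path) else None


if __name__ == "__main__":
    location = find_saved_models()
    if location is None:
        print("saved_models not found in", os.path.abspath("."))
    else:
        print("found pre-trained models folder at", location)
```

Run it from the same directory as demo.py so that the relative path matches.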

Results

Training Performance

Demo: Walker2d-v1

Note: newer versions of openai-gym have a rendering problem, so the demo uses Walker2d-v1.
Tip: while watching the demo, you can press TAB to switch the camera in MuJoCo.
