Model-Ensemble Trust-Region Policy Optimization (ME-TRPO)

This repo is based on the original paper: Kurutach, Thanard, et al. "Model-Ensemble Trust-Region Policy Optimization." arXiv preprint arXiv:1802.10592 (2018).

We modified the repo to perform benchmarking as part of the Model Based Reinforcement Learning Benchmarking Library (MBBL). Please refer to the project page for more information.

We also recommend reading the repo shared by the authors of ME-TRPO.

Authors

Xuchan Bao

Guodong Zhang

Tingwu Wang

Prerequisites

You need a MuJoCo license and MuJoCo 1.31, which can be downloaded from https://www.roboti.us/. Useful information on installing MuJoCo can be found at https://github.com/openai/mujoco-py.

Create a Conda environment

It's recommended to create a new Conda environment for this repo:

conda create -n <env_name> python=3.5

Python 3.6 also works.

Install package dependencies

pip install -r requirements.txt

Then please go to MBBL to install the mbbl package for the environments.

Run benchmarking

To run the benchmarking environments, please refer to ./metrpo_gym_search_new.sh.

Run other experiments

Run experiments using the following command:

python main.py --env <env_name> --exp_name <experiment_name> --sub_exp_name <exp_save_dir>

env_name: one of (half_cheetah, ant, hopper, swimmer)
exp_name: what you want to call your experiment
sub_exp_name: partial path for saving experiment logs and results

Experiment results will be logged to ./experiments/<exp_save_dir>/<experiment_name>

e.g. python main.py --env half_cheetah --exp_name test-exp --sub_exp_name test-exp-dir
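If you want to launch the same experiment on several environments, a small helper can assemble the command line and the expected log directory from the three arguments described above. The following is only a sketch: the helper name build_run and the VALID_ENVS tuple are hypothetical and not part of this repo, and the log path simply follows the ./experiments/<exp_save_dir>/<experiment_name> pattern stated above.

```python
import os

# Environment names listed in this README (VALID_ENVS itself is a hypothetical name).
VALID_ENVS = ("half_cheetah", "ant", "hopper", "swimmer")


def build_run(env_name, exp_name, sub_exp_name):
    """Return (argv, log_dir) for one main.py run. Hypothetical helper."""
    if env_name not in VALID_ENVS:
        raise ValueError("unknown env: %s" % env_name)
    # Same flags as the command shown in this README.
    argv = [
        "python", "main.py",
        "--env", env_name,
        "--exp_name", exp_name,
        "--sub_exp_name", sub_exp_name,
    ]
    # Assumed layout: ./experiments/<exp_save_dir>/<experiment_name>
    log_dir = os.path.join(".", "experiments", sub_exp_name, exp_name)
    return argv, log_dir


if __name__ == "__main__":
    # Print one command per environment without running anything.
    for env in VALID_ENVS:
        argv, log_dir = build_run(env, "test-exp", "test-exp-dir")
        print(" ".join(argv), "->", log_dir)
```

The returned argv list can be passed to subprocess.run if you prefer to launch the runs from Python rather than copying the printed commands.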

Change configurations

You can modify the configuration parameters in configs/params_<env_name>.json.
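Instead of editing the JSON by hand, the file can also be changed from a script. The sketch below assumes the parameters you want to change are top-level keys of the JSON object; the actual files may be nested, in which case the helper would need adapting. The function name update_config is hypothetical, and it refuses keys that are not already in the file so that a typo does not silently add an unused parameter.

```python
import json


def update_config(path, overrides):
    """Load a JSON config, apply top-level overrides, and write it back.

    Hypothetical helper; assumes the overridden keys are top-level.
    """
    with open(path) as f:
        params = json.load(f)
    # Reject keys that do not already exist to catch typos early.
    unknown = [k for k in overrides if k not in params]
    if unknown:
        raise KeyError("keys not present in %s: %s" % (path, unknown))
    params.update(overrides)
    with open(path, "w") as f:
        json.dump(params, f, indent=2, sort_keys=True)
    return params
```

For example, update_config("configs/params_half_cheetah.json", {"some_key": 1}) would rewrite that file with some_key set to 1, where some_key is a placeholder for a parameter name that actually appears in the file.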
