Skip to content

WilsonWangTHU/mbbl-metrpo

Repository files navigation

Model-Ensemble Trust-Region Policy Optimization (ME-TRPO)

This repo is based on the original paper Kurutach, Thanard, et al. "Model-Ensemble Trust-Region Policy Optimization." arXiv preprint arXiv:1802.10592 (2018).link.

We modified the repo to perform benchmarking as part of the Model Based Reinforcement Learning Benchmarking Library (MBBL). Please refer to the project page for more information.

We also recommend reading of this repo, which is the repo shared by the authors of METRPO

Authors

Xuchan Bao

Guodong Zhang

Tingwu Wang

Prerequisites

You need a MuJoCo license, and download MuJoCo 1.31. from https://www.roboti.us/. Useful information for installing MuJoCo can be found at https://github.com/openai/mujoco-py.

Create a Conda environment

It's recommended to create a new Conda environment for this repo:

conda create -n <env_name> python=3.5

Or you can use python 3.6.

Install package dependencies

pip install -r requirements.txt

Then please go to MBBL to install the mbbl package for the environments.

Run benchmarking

To run the benchmarking environments, please refer to ./metrpo_gym_search_new.sh.

Run other experiments

Run experiments using the following command:

python main.py --env <env_name> --exp_name <experiment_name> --sub_exp_name <exp_save_dir>

  • env_name: one of (half_cheetah, ant, hopper, swimmer)
  • exp_name: what you want to call your experiment
  • sub_exp_name: partial path for saving experiment logs and results

Experiment results will be logged to ./experiments/<exp_save_dir>/<experiment_name>

e.g. python main.py --env half_cheetah --exp_name test-exp --sub_exp_name test-exp-dir

Change configurations

You can modify the configuration parameters in configs/params_<env_name>.json.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published