Overview

An implementation of Data-Efficient Hierarchical Reinforcement Learning (HIRO) in PyTorch.

Installation

Follow the OpenAI Gym MuJoCo installation instructions:

1. Obtain a 30-day free trial on the MuJoCo website, or a free license if you are a student. The license key will arrive in an email with your username and password.
2. Download the MuJoCo version 2.0 binaries for Linux or OSX.
3. Unzip the downloaded mujoco200 directory into ~/.mujoco/mujoco200, and place your license key (the mjkey.txt file from your email) at ~/.mujoco/mjkey.txt.
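The layout described in step 3 can be verified before installing the Python dependencies. The following is a minimal sketch, not part of this repository: the helper name `check_mujoco_layout` is hypothetical, and it only checks that the two paths from step 3 exist, not that the license key is valid.

```python
from pathlib import Path


def check_mujoco_layout(home=None):
    """Return the expected MuJoCo paths (from step 3) that are missing.

    `home` defaults to the current user's home directory; passing another
    directory is only useful for testing this helper.
    """
    root = Path(home) if home is not None else Path.home()
    expected = [
        root / ".mujoco" / "mujoco200",  # unzipped MuJoCo 2.0 binaries
        root / ".mujoco" / "mjkey.txt",  # license key from the email
    ]
    return [str(path) for path in expected if not path.exists()]


if __name__ == "__main__":
    missing = check_mujoco_layout()
    if missing:
        print("Missing MuJoCo files:")
        for path in missing:
            print("  " + path)
    else:
        print("MuJoCo files found in the expected locations.")
```

An empty list means both paths are in place; anything else names the path that still needs to be created.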

Install Dependencies

pip install -r requirements.txt

Run

For HIRO,

python main.py --train

For TD3,

python main.py --train --td3

Evaluate Trained Model

Passing the --eval argument loads the most recently saved model parameters and starts playing. The goal is to reach the position (0, 16), which is the top-left corner.
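How the repository decides which parameters are the most recent is not described here. One common approach is to pick the checkpoint file with the newest modification time, sketched below. The function name `latest_checkpoint`, the `*.pth` pattern, and the `models` directory in the usage note are all assumptions for illustration, not names taken from this repository.

```python
from pathlib import Path


def latest_checkpoint(model_dir, pattern="*.pth"):
    """Return the most recently modified file matching `pattern`, or None.

    Both the directory and the file pattern are hypothetical; adjust them
    to wherever the training run actually writes its parameters.
    """
    candidates = [p for p in Path(model_dir).glob(pattern) if p.is_file()]
    if not candidates:
        return None
    # st_mtime is the last-modification timestamp in seconds.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

For example, `latest_checkpoint("models")` would return the newest `.pth` file in a `models` directory, or `None` if no checkpoint has been saved yet.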

For HIRO,

python main.py --eval

For TD3,

python main.py --eval --td3

Training result

Blue is HIRO and orange is TD3

Success Rate
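This README does not state how an episode is judged successful. A plausible reading, sketched below, is that an episode succeeds when the agent ends within some radius of the goal at (0, 16). The radius `THRESHOLD` and both function names are assumptions for illustration and may not match what the repository computes.

```python
import math

GOAL = (0.0, 16.0)  # goal position given in this README
THRESHOLD = 5.0     # assumed success radius; not stated in this README


def is_success(final_xy, goal=GOAL, threshold=THRESHOLD):
    """True if the final (x, y) position lies within `threshold` of `goal`."""
    return math.dist(final_xy, goal) <= threshold


def success_rate(final_positions, goal=GOAL, threshold=THRESHOLD):
    """Fraction of episodes whose final position counts as a success."""
    if not final_positions:
        return 0.0
    hits = sum(is_success(xy, goal, threshold) for xy in final_positions)
    return hits / len(final_positions)
```

With these assumptions, an episode ending at (1, 15) counts as a success and one ending at the origin does not.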

Reward

Intrinsic Reward
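In the HIRO paper, the lower-level controller is rewarded for moving the state toward the goal proposed by the higher-level controller: the intrinsic reward is the negative Euclidean distance between `s_t + g_t` and `s_{t+1}`, and between higher-level decisions the goal is updated by `h(s_t, g_t, s_{t+1}) = s_t + g_t - s_{t+1}`. The sketch below follows those two formulas on plain Python lists. The function names are illustrative, and this repository's implementation may differ, for example in scaling or in which state dimensions the goal applies to.

```python
import math


def intrinsic_reward(state, goal, next_state):
    """Negative L2 distance between (state + goal) and next_state."""
    squared = sum((s + g - n) ** 2 for s, g, n in zip(state, goal, next_state))
    return -math.sqrt(squared)


def goal_transition(state, goal, next_state):
    """Re-express the goal relative to next_state: s_t + g_t - s_{t+1}."""
    return [s + g - n for s, g, n in zip(state, goal, next_state)]
```

The reward is largest (zero) when the lower-level controller lands exactly on the requested offset, and the goal transition keeps the absolute target `s_t + g_t` fixed as the state changes.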

Losses

Higher Controller Actor

Higher Controller Critic

Lower Controller Actor

Lower Controller Critic

TD3 Controller Actor

TD3 Controller Critic

watakandai/hiro_pytorch
