TAAC: Temporally abstract actor-critic for continuous control

[Figure: TAAC diagram]

This repo releases the code for

TAAC: Temporally Abstract Actor-Critic for Continuous Control, Yu et al., NeurIPS 2021.

It also contains the experiment configuration files for training TAAC on 14 continuous control tasks across 5 categories, as done in the paper.

What is TAAC?

In a nutshell, TAAC is an off-policy (sample efficient!) actor-critic algorithm that has closed-loop action repetition (temporal abstraction!) built in.

  • TAAC occupies a middle ground between "flat" RL (e.g., SAC) and hierarchical RL (e.g., options, goals, etc.).
  • TAAC is conceptually simple. Its implementation closely resembles SAC (see the sketch below).
  • TAAC natively supports unbiased multi-step TD backup, with a novel compare-through operator!
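
To make the act-or-repeat idea concrete, here is a minimal, illustrative PyTorch sketch of a closed-loop action-repetition step. It is not the official ALF implementation (see alf/algorithms/taac_algorithm.py for that); the actor/critic interfaces, the Boltzmann form of the switch, and all names here are simplifying assumptions.

import torch

def taac_act(actor, critic, obs, prev_action, temperature=1.0):
    """Illustrative sketch only: pick the action to execute given the
    previously executed action. Not the official TAAC implementation."""
    # Stage 1: a SAC-style actor proposes a *candidate* new action.
    candidate = actor(obs).sample()
    # Stage 2: a binary "repeat-or-act" switch compares the two options.
    # Sketched here as a Boltzmann choice over the critic's scalar values;
    # the exact switching policy used by TAAC is defined in the paper.
    q_repeat = critic(obs, prev_action)   # shape [B]
    q_switch = critic(obs, candidate)     # shape [B]
    logits = torch.stack([q_repeat, q_switch], dim=-1) / temperature
    b = torch.distributions.Categorical(logits=logits).sample()
    # b == 0: keep repeating the previous action (temporal abstraction);
    # b == 1: commit to the newly proposed action.
    return torch.where(b.bool().unsqueeze(-1), candidate, prev_action)

Because the switch consults the current state at every step, the repetition stays closed-loop: the agent can break out of a repeated action as soon as the state calls for it.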

Highlights of TAAC

TAAC outperforms several strong baselines by a large margin on 14 complex continuous control tasks:

[Figure: TAAC performance]

TAAC learns to skip generating new actions at non-critical states, saving the actor network's representational power for the critical ones!

[Figure: TAAC pattern]

More highlights can be found on this poster.

A detailed walkthrough of TAAC is in this video.

Installation

Our experiments use the training pipelines and algorithms of the Agent Learning Framework (ALF). ALF currently supports Python 3.7+, and Virtualenv is recommended for the installation. After activating a virtual env, download and install ALF:

git clone https://github.com/HorizonRobotics/alf
cd alf
git checkout fb30ce1 -B taac
pip install -e .

On top of the basic ALF installation,

  • The Terrain task category requires installing box2d-py (see the example below).
  • The Manipulation and Locomotion task categories require installing MuJoCo. Our experiments use MuJoCo 2.0; a different version might produce different training results, so we suggest using this exact version for reproduction. Please follow the instructions at https://github.com/openai/mujoco-py.
  • The Driving task category requires installing CARLA; we used version 0.9.9 in our experiments. Installation instructions can be found in <ALF_ROOT>/alf/environments/suite_carla.py.
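
For example, box2d-py can usually be installed directly from PyPI (assuming pip is run inside the same virtual env as ALF):

pip install box2d-py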

After the installation, clone this repo under ALF:

cd <ALF_ROOT>/alf/examples
git clone https://github.com/hnyu/taac

Run experiments

To run an experiment (e.g., training TAAC on BipedalWalker-v2):

cd <ALF_ROOT>/alf/examples
python -m alf.bin.train --root_dir=<TRAIN_JOB_DIR> --gin_file taac/experiments/taac/taac_terrain.gin --gin_param="create_environment.env_name='BipedalWalker-v2'"

Then open TensorBoard to view the training results:

tensorboard --logdir=<TRAIN_JOB_DIR>

Tasks

The 14 tasks can be trained by providing the corresponding environment names to the 5 gin files:

gin file                      create_environment.env_name
<method>_simple_control.gin   "MountainCarContinuous-v0"
                              "LunarLanderContinuous-v2"
                              "InvertedDoublePendulum-v2"
<method>_locomotion.gin       "Hopper-v2"
                              "Ant-v2"
                              "Walker2d-v2"
                              "HalfCheetah-v2"
<method>_terrain.gin          "BipedalWalker-v2"
                              "BipedalWalkerHardcore-v2"
<method>_manipulation.gin     "FetchReach-v1"
                              "FetchPush-v1"
                              "FetchSlide-v1"
                              "FetchPickAndPlace-v1"
<method>_driving.gin          "Town01"
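
For example, to train TAAC on Hopper-v2, pass the locomotion gin file instead (assuming the other configs live under the same taac/experiments/taac/ directory as the terrain example above):

cd <ALF_ROOT>/alf/examples
python -m alf.bin.train --root_dir=<TRAIN_JOB_DIR> --gin_file taac/experiments/taac/taac_locomotion.gin --gin_param="create_environment.env_name='Hopper-v2'"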

Code reading

The entire TAAC algorithm is implemented in the file alf/algorithms/taac_algorithm.py of the ALF repo cloned above.

Troubleshooting

  • Sometimes launching a job complains about not finding rsync (ALF uses rsync to back up training code). Install rsync first and try again (see the example below), or simply append the flag --nostore_snapshot when launching the job.
  • If CARLA fails with "Fail to start server", just give it another try.
  • If pip installing ALF fails with an error about a missing Python.h, first install the Python development package, e.g., sudo apt install python3.7-dev.
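
On Debian/Ubuntu systems, for instance, rsync can be installed with:

sudo apt install rsync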

Citation

If you use TAAC in your research, please consider citing:

@inproceedings{Yu2021TAAC,
    author={Haonan Yu and Wei Xu and Haichao Zhang},
    title={TAAC: Temporally Abstract Actor-Critic for Continuous Control},
    booktitle={NeurIPS},
    year={2021}
}
