Look Beneath the Surface: Exploiting Fundamental Symmetry for Sample-Efficient Offline Reinforcement Learning (NeurIPS 2023)

TSRL (https://arxiv.org/abs/2306.04220) is an offline reinforcement learning (RL) algorithm that leverages the fundamental time-reversal symmetry of system dynamics to improve performance on small datasets. Its core component, the T-symmetry enforced Dynamics Model (TDM), establishes consistency between a pair of forward and reverse latent dynamics, providing well-behaved representations even when data is scarce. TSRL achieves strong performance on small benchmark datasets containing as few as 1% of the original samples, outperforming recent offline RL algorithms in data efficiency and generalizability.
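
To make the T-symmetry idea concrete, below is a minimal PyTorch sketch of a forward/reverse latent dynamics pair trained for consistency. All names, network sizes, and the exact loss form are illustrative assumptions, not the repository's actual TDM implementation (see `TDM/` for that):

    import torch
    import torch.nn as nn

    class TDMSketch(nn.Module):
        """Toy T-symmetry model: a shared encoder plus forward and
        reverse latent dynamics heads trained to agree with each other."""
        def __init__(self, state_dim, action_dim, latent_dim=64):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Linear(state_dim, 128), nn.ReLU(), nn.Linear(128, latent_dim))
            # forward dynamics: (z_t, a_t) -> z_{t+1}
            self.fwd = nn.Sequential(
                nn.Linear(latent_dim + action_dim, 128), nn.ReLU(),
                nn.Linear(128, latent_dim))
            # reverse dynamics: (z_{t+1}, a_t) -> z_t
            self.rvs = nn.Sequential(
                nn.Linear(latent_dim + action_dim, 128), nn.ReLU(),
                nn.Linear(128, latent_dim))

        def loss(self, s, a, s_next):
            z, z_next = self.encoder(s), self.encoder(s_next)
            z_next_pred = self.fwd(torch.cat([z, a], dim=-1))
            z_pred = self.rvs(torch.cat([z_next, a], dim=-1))
            # forward and reverse predictions must both match the encoded targets
            return ((z_next_pred - z_next) ** 2).mean() + ((z_pred - z) ** 2).mean()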

Usage

To install the dependencies, use

    pip install -r requirements.txt

1. Get small samples

You can download the small samples directly from `utils/small_samples/`.

Or, if you want to generate them yourself, run:

    bash utils/generate_loco.sh # For the locomotion tasks

and

    bash utils/generate_adroit.sh # For the adroit tasks
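
If you prefer to subsample a D4RL dataset in your own code, the following is a rough sketch using the standard d4rl API. Note that the provided scripts may sample whole trajectories rather than individual transitions as done here; the task name, ratio, and output file are placeholders:

    import gym
    import d4rl  # registers the D4RL environments with gym
    import numpy as np

    env = gym.make('hopper-medium-v2')      # placeholder task
    dataset = d4rl.qlearning_dataset(env)   # dict of transition arrays

    # keep a random 1% of the transitions
    n = dataset['observations'].shape[0]
    idx = np.random.choice(n, size=max(1, int(0.01 * n)), replace=False)
    small = {k: v[idx] for k, v in dataset.items()}
    np.save('hopper-medium-v2_1pct.npy', small)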

2. Train TDM models

You can train a TDM simply by running:

    bash TDM/train_loco.sh # For the locomotion tasks 

and

    bash TDM/train_adroit.sh #  For the adroit tasks
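
Under the hood, TDM training amounts to minimizing the consistency loss over minibatches of transitions. A hypothetical loop for the `TDMSketch` above (all names and hyperparameters are placeholders, not the scripts' settings):

    # assumes TDMSketch and the subsampled dataset from the earlier sketches
    data = np.load('hopper-medium-v2_1pct.npy', allow_pickle=True).item()
    s = torch.as_tensor(data['observations'], dtype=torch.float32)
    a = torch.as_tensor(data['actions'], dtype=torch.float32)
    s2 = torch.as_tensor(data['next_observations'], dtype=torch.float32)

    model = TDMSketch(state_dim=s.shape[1], action_dim=a.shape[1])
    opt = torch.optim.Adam(model.parameters(), lr=3e-4)
    for step in range(10_000):
        i = torch.randint(0, s.shape[0], (256,))  # random minibatch
        loss = model.loss(s[i], a[i], s2[i])
        opt.zero_grad()
        loss.backward()
        opt.step()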

3. Run TSRL on benchmark experiments

Once you have your own small samples and a trained TDM model, you can run TSRL on the D4RL tasks by running:

    bash tsrl_loco.sh # For the locomotion tasks 

and

    bash tsrl_adroit.sh # For the adroit tasks
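
Conceptually, TSRL uses the trained TDM's latent representations and T-symmetry consistency to regularize offline policy learning. The fragment below is only a hypothetical illustration of that idea, not the actual TSRL objective; see the paper and the scripts above for the real loss:

    # Hypothetical illustration only: penalize policy actions whose predicted
    # transitions violate T-symmetry under the trained TDM.
    def policy_loss(actor, critic, tdm, s, penalty_weight=1.0):
        a = actor(s)
        z = tdm.encoder(s)
        z_next_pred = tdm.fwd(torch.cat([z, a], dim=-1))
        z_back = tdm.rvs(torch.cat([z_next_pred, a], dim=-1))
        sym_penalty = ((z_back - z) ** 2).mean()
        return -critic(s, a).mean() + penalty_weight * sym_penalty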

Visualization of learning curves

You can use wandb (Weights & Biases) to track the learning curves. Log in to your personal account by exporting your wandb API key:

    export WANDB_API_KEY=YOUR_WANDB_API_KEY

and run

    wandb online

to turn on online synchronization.
