DSRL


DSRL (Datasets for Safe Reinforcement Learning) provides a rich collection of datasets specifically designed for offline Safe Reinforcement Learning (RL). Created with the objective of fostering progress in offline safe RL research, DSRL bridges a crucial gap in the availability of safety-centric public benchmarks and datasets.

DSRL provides:

  1. Diverse datasets: 38 datasets across different safe RL environments and difficulty levels in SafetyGymnasium, BulletSafetyGym, and MetaDrive, all prepared with safety considerations.
  2. Consistent API with D4RL: For easy use and evaluation of offline learning methods.
  3. Data post-processing filters: Allowing alteration of data density, noise level, and reward distributions to simulate various data collection conditions.

This package is a part of a comprehensive benchmarking suite that includes FSRL and OSRL and aims to promote advancements in the development and evaluation of safe learning algorithms.

We provide a detailed breakdown of the datasets in the docs folder, including all the environments we use, the dataset sizes, and the cost-reward-return plot for each dataset.

To learn more, please visit our project website. If you find this code useful, please cite:

@article{liu2023datasets,
  title={Datasets and Benchmarks for Offline Safe Reinforcement Learning},
  author={Liu, Zuxin and Guo, Zijian and Lin, Haohong and Yao, Yihang and Zhu, Jiacheng and Cen, Zhepeng and Hu, Hanjiang and Yu, Wenhao and Zhang, Tingnan and Tan, Jie and others},
  journal={arXiv preprint arXiv:2306.09303},
  year={2023}
}

Installation

Install from PyPI

DSRL is hosted on PyPI; you can install it with:

pip install dsrl

By default, this also installs the bullet-safety-gym and safety-gymnasium environments.

If you want to use the MetaDrive environment, please install it via:

pip install git+https://github.com/HenryLHH/metadrive_clean.git@main

Install from source

Clone this repo and install:

git clone https://github.com/liuzuxin/DSRL.git
cd DSRL
pip install -e .

You can also install the MetaDrive package by specifying the option:

pip install -e .[metadrive]

How to use DSRL

DSRL uses the Gymnasium API. Tasks are created via the gymnasium.make function. Each task is associated with a fixed offline dataset, which can be obtained with the env.get_dataset() method. This method returns a dictionary with:

  • observations: An N × obs_dim array of observations.
  • next_observations: An N × obs_dim array of next observations.
  • actions: An N × act_dim array of actions.
  • rewards: An N dimensional array of rewards.
  • costs: An N dimensional array of costs.
  • terminals: An N dimensional array of episode termination flags. This is true when episodes end due to termination conditions such as falling over.
  • timeouts: An N dimensional array of termination flags. This is true when episodes end due to reaching the maximum episode length.
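
For example (a minimal sketch, assuming the field layout described above), episode boundaries can be recovered by combining the two flag arrays:

import numpy as np
import gymnasium as gym
import dsrl

env = gym.make('OfflineCarCircle-v0')
dataset = env.get_dataset()

# an episode ends when it either terminates or hits the time limit
done = np.logical_or(dataset['terminals'], dataset['timeouts'])
num_episodes = int(done.sum())
print(f"{num_episodes} episodes over {len(done)} transitions")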

The usage is similar to D4RL. Here is an example:

import gymnasium as gym
import dsrl

# Create the environment
env = gym.make('OfflineCarCircle-v0')

# Each task is associated with a dataset
# dataset contains observations, next_observations, actions, rewards, costs, terminals, timeouts
dataset = env.get_dataset()
print(dataset['observations']) # An N x obs_dim Numpy array of observations

# dsrl abides by the Gymnasium API
obs, info = env.reset()
obs, reward, terminal, timeout, info = env.step(env.action_space.sample())
cost = info["cost"]

# Apply dataset filters [optional]
# dataset = env.pre_process_data(dataset, filter_cfgs)

Datasets are automatically downloaded to the ~/.dsrl/datasets directory when get_dataset() is called. If you would like to change the location of this directory, you can set the $DSRL_DATASET_DIR environment variable to a directory of your choosing, or pass the dataset filepath directly to the get_dataset method.
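
For example (a sketch; the directory path is a placeholder, and the h5path keyword follows the D4RL convention rather than anything confirmed in this README):

import os

# point dsrl at a custom cache directory before the dataset is first downloaded
os.environ["DSRL_DATASET_DIR"] = "/data/dsrl_datasets"  # placeholder path

import gymnasium as gym
import dsrl

env = gym.make('OfflineCarCircle-v0')
dataset = env.get_dataset()

# alternatively, load from a local file; the 'h5path' keyword name is an
# assumption borrowed from D4RL, not confirmed by this README
# dataset = env.get_dataset(h5path="/data/dsrl_datasets/OfflineCarCircle-v0.hdf5")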

You can run the following example scripts to play with the offline datasets of all the supported environments:

python examples/run_mujoco.py --agent [your_agent] --task [your_task]
python examples/run_bullet.py --agent [your_agent] --task [your_task]
python examples/run_metadrive.py --road [your_road] --traffic [your_traffic] 

Normalizing Scores

  • Set the target cost with the env.set_target_cost(target_cost) function, where target_cost is the undiscounted sum of costs over an episode.
  • Use the env.get_normalized_score(return, cost_return) function to compute the normalized reward and cost for an episode, where return and cost_return are the undiscounted sums of rewards and costs of that episode, respectively (see the sketch below).
  • The per-task min and max reference returns are stored in dsrl/infos.py.
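
A minimal evaluation sketch putting these together (the random-action rollout and the cost budget of 20 are illustrative; only set_target_cost and get_normalized_score come from the API described above):

import gymnasium as gym
import dsrl

env = gym.make('OfflineCarCircle-v0')
env.set_target_cost(20)  # illustrative undiscounted episode cost budget

obs, info = env.reset()
episode_return, episode_cost = 0.0, 0.0
terminal = timeout = False
while not (terminal or timeout):
    obs, reward, terminal, timeout, info = env.step(env.action_space.sample())
    episode_return += reward
    episode_cost += info["cost"]

# assumed to return a (normalized reward, normalized cost) pair per the description above
normalized_return, normalized_cost = env.get_normalized_score(episode_return, episode_cost)
print(normalized_return, normalized_cost)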

License

All datasets are licensed under the Creative Commons Attribution 4.0 License (CC BY), and code is licensed under the Apache 2.0 License.