Offline Multi-Agent Reinforcement Learning with Implicit Global-to-Local Value Regularization (NeurIPS 2023)

The official implementation of "Offline Multi-Agent Reinforcement Learning with Implicit Global-to-Local Value Regularization". OMIGA provides a principled framework that converts global-level value regularization into equivalent implicit local value regularizations while simultaneously enabling in-sample learning, thereby bridging multi-agent value decomposition and policy learning with offline regularizations. This repository is inspired by the TRPO-in-MARL library for online multi-agent RL.
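As a rough sketch of the idea (the notation below is assumed for illustration, not copied from the paper): OMIGA builds on a value decomposition in which the global action value is a state-dependent, non-negatively weighted combination of per-agent local values, for example

```latex
% Schematic decomposition only; see the paper for the exact parameterization.
Q_{\mathrm{tot}}(s, \mathbf{a}) \;=\; \sum_{i=1}^{n} w_i(s)\, Q_i(s, a_i) \;+\; b(s),
\qquad w_i(s) \ge 0 .
```

Under such a decomposition, a value regularizer imposed at the global level induces equivalent implicit regularizers on each local $Q_i$, which is what permits in-sample learning per agent.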

This repo provides the implementation of OMIGA on the Multi-Agent MuJoCo benchmark.

Installation

conda create -n OMIGA python=3.9
conda activate OMIGA
git clone https://github.com/ZhengYinan-AIR/OMIGA.git
cd OMIGA
pip install -r requirements.txt

How to run

Before running the code, you need to download the necessary offline datasets (Download link). Then make sure the config file at configs/config.py is set up correctly: set the data_dir parameter to the storage location of the downloaded data, and configure the scenario, agent_conf, and data_type parameters. You can then run the code as follows:

# If the location of the dataset is at: "/data/Ant-v2-2x4-expert.hdf5"
cd OMIGA
python run_mujoco.py --data_dir="/data/" --scenario="Ant-v2" --agent_conf="2x4" --data_type="expert"
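The example above implies that the dataset filename is assembled from the three flags. As an illustrative sketch (the helper `dataset_path` is hypothetical, not a function from this repo), the file looked up under `data_dir` would be:

```python
import os

# Hypothetical helper: reconstruct the dataset path that the run script
# would look up, following the "<scenario>-<agent_conf>-<data_type>.hdf5"
# naming pattern shown in the example above.
def dataset_path(data_dir, scenario, agent_conf, data_type):
    fname = f"{scenario}-{agent_conf}-{data_type}.hdf5"
    return os.path.join(data_dir, fname)

print(dataset_path("/data/", "Ant-v2", "2x4", "expert"))
# -> /data/Ant-v2-2x4-expert.hdf5
```

If a run fails to find its data, checking that the downloaded files match this naming pattern is a reasonable first step.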

Weights and Biases Online Visualization Integration

This codebase can also log to the W&B online visualization platform. To enable W&B logging, first set your W&B API key as an environment variable:

wandb online
export WANDB_API_KEY='YOUR W&B API KEY HERE'

Then you can run experiments with W&B logging turned on:

python run_mujoco.py --wandb=True

Bibtex

If you find our code and paper helpful, please cite our paper as:

@article{wang2023offline,
  title={Offline Multi-Agent Reinforcement Learning with Implicit Global-to-Local Value Regularization},
  author={Wang, Xiangsen and Xu, Haoran and Zheng, Yinan and Zhan, Xianyuan},
  journal={Advances in Neural Information Processing Systems},
  year={2023}
}
