An implementation of MADDPG

1. Introduction

This is a PyTorch implementation of the multi-agent deep deterministic policy gradient (MADDPG) algorithm.
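For orientation, MADDPG trains one actor and one critic per agent: each actor acts on its own observation only, while each critic is conditioned on the observations and actions of all agents during training. The sketch below illustrates only that input layout in plain Python. The function names and the toy dimensions are illustrative assumptions and are not taken from MADDPG.py or model.py.

```python
from typing import List, Sequence


def actor_input(observations: Sequence[Sequence[float]], agent: int) -> List[float]:
    """Decentralized actor: an agent only sees its own observation."""
    return list(observations[agent])


def critic_input(
    observations: Sequence[Sequence[float]],
    actions: Sequence[Sequence[float]],
) -> List[float]:
    """Centralized critic: concatenate every agent's observation and action."""
    if len(observations) != len(actions):
        raise ValueError("need one action per agent")
    joint: List[float] = []
    for obs in observations:
        joint.extend(obs)
    for act in actions:
        joint.extend(act)
    return joint


# Toy example (hypothetical sizes): 2 agents, 3-dim observations, 2-dim actions.
obs = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
acts = [[1.0, -1.0], [0.5, 0.0]]

own_view = actor_input(obs, agent=1)   # what agent 1's actor receives
joint_view = critic_input(obs, acts)   # what any agent's critic receives
```

With n agents, the critic input has length n * (obs_dim + act_dim), so it grows with the number of agents while each actor input stays fixed.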

The experimental environment is a modified version of Waterworld based on MADRL.

2. Environment

The main differences between the modified Waterworld environment and the MADRL original are:

- Evaders and poisons now bounce off the walls according to physical rules.
- Evaders, pursuers and poisons are now the same size, so that random actions lead to average rewards around 0.
- Exactly n_coop agents are needed to catch food.
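As a rough illustration of the wall-bounce rule, the sketch below reflects a moving object off the walls of an axis-aligned box. It is a simplified assumption, not the environment's code: objects are treated as points, the arena is taken to be the unit square, and the name bounce_step is hypothetical.

```python
from typing import List, Sequence, Tuple


def bounce_step(
    pos: Sequence[float],
    vel: Sequence[float],
    dt: float = 1.0,
    lo: float = 0.0,
    hi: float = 1.0,
) -> Tuple[List[float], List[float]]:
    """Advance one step and reflect off the walls of the box [lo, hi]^d.

    Assumes the object moves less than one box width per step, so at most
    one reflection per axis is needed.
    """
    new_pos: List[float] = []
    new_vel: List[float] = []
    for p, v in zip(pos, vel):
        p = p + v * dt
        if p < lo:
            p = lo + (lo - p)   # mirror the overshoot back inside
            v = -v              # reverse the velocity component normal to the wall
        elif p > hi:
            p = hi - (p - hi)
            v = -v
        new_pos.append(p)
        new_vel.append(v)
    return new_pos, new_vel


# An object heading for the right wall comes back with its x-velocity flipped.
position, velocity = bounce_step([0.9, 0.5], [0.2, 0.0])
```

Only the velocity component perpendicular to the wall changes sign, which is what an elastic reflection off a flat wall looks like.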

3. Dependency

- PyTorch
- visdom
- python==3.6.1 (Anaconda or Miniconda recommended)
- OpenCV, only if you need to render the environments
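A quick way to check the list above before running anything is to ask Python whether each package can be imported. The helper below is a small sketch, not part of this repo. It assumes the usual import names torch, visdom and cv2, and it only checks that the modules can be found, not their versions.

```python
import importlib.util
from typing import Dict, List

# Display name -> import name (assumed to be the usual ones).
REQUIRED: Dict[str, str] = {"pytorch": "torch", "visdom": "visdom"}
OPTIONAL: Dict[str, str] = {"opencv": "cv2"}  # only needed for rendering


def missing(modules: Dict[str, str]) -> List[str]:
    """Return the display names of modules that cannot be found."""
    return [
        name
        for name, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


print("missing required packages:", missing(REQUIRED) or "none")
print("missing optional packages:", missing(OPTIONAL) or "none")
```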

4. Install

- Install MADRL.
- Replace the madrl_environments/pursuit directory with the one in this repo.
- Run python main.py

If scene rendering is enabled, installing OpenCV through conda-forge is recommended.

5. Results

two agents, cooperation = 2

The two agents must cooperate to catch the food, which yields a reward of 10.
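The cooperation requirement can be sketched as a simple counting rule: food is caught only when the required number of agents touch it at the same time. The code below is an illustrative assumption, not the environment's implementation. The name catch_reward and the touch_radius parameter are hypothetical, and the rule follows the "exactly n_coop" wording literally, since how extra agents are handled is not stated here.

```python
import math
from typing import Sequence

FOOD_REWARD = 10.0  # reward for catching food, as stated above


def catch_reward(
    food_pos: Sequence[float],
    agent_positions: Sequence[Sequence[float]],
    n_coop: int,
    touch_radius: float,
) -> float:
    """Return the food reward if exactly n_coop agents touch the food, else 0."""
    touching = sum(
        1
        for pos in agent_positions
        if math.hypot(pos[0] - food_pos[0], pos[1] - food_pos[1]) <= touch_radius
    )
    return FOOD_REWARD if touching == n_coop else 0.0


food = (0.51, 0.50)
both_close = [(0.50, 0.50), (0.52, 0.50)]
one_far = [(0.50, 0.50), (0.90, 0.90)]

reward_together = catch_reward(food, both_close, n_coop=2, touch_radius=0.05)
reward_alone = catch_reward(food, one_far, n_coop=2, touch_radius=0.05)
```

With cooperation = 2, a single agent reaching the food earns nothing, which is why the two agents have to learn to arrive together.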

the average

one agent, cooperation = 1

6. TODO

Reproduce the experiments in the paper with competitive environments.
