Skip to content

hex-plex/GNN-MARL

Repository files navigation

Graph Attention For Communication in MARL

This is an implementation of Communication in MARL using Graph Neural Network. This is been trained and tested on StarCraft II, And this has shown improved training and performance metrics throughout all the maps. I have implemented this on top of PyMARL for easier comparative study with respect to other algorithms or implementations like ePyMARL.

Currently we have the following algorithms for training.

For communication we have used to different Architecures

Graph Neural Network.

More Information about the architecture and the execution can be found at MultiAgent GNN A brief outline would be as follows

Pipeline for communication using Graph Neural Network

The implementation is written in PyTorch and uses a modified version of SMAC which could be found in smac-py to include the adjacency matrix as the observation more detail on it can be found here.

For a glimpse of the algorithm in action checkout the Output section

Installation instructions

I have used the default installation given in PyMARL, along which I have added a few changes to work with the latest version of pytorch (i.e., 1.10.0 at the time of documentation.) and added the requirements for the pytorch_geometric

-In the PyMARL repo the version of Cuda and the required version of Pytorch is very old
+The whole codebase is shifted to the latest torch 1.10.0 and cuda 11.3
+Hence custom installation would be better
-Use of the current Docker file is depreciated

Build the Dockerfile using

cd docker
bash build.sh

Set up StarCraft II and SMAC:

bash install_sc2.sh

Alternatively

After downloading SC2 follow the following steps

pip install -r requirements.txt
pip install -e smac-py

This will download SC2 into the 3rdparty folder and copy the maps necessary to run over.

Run an experiment

python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z

The config files act as defaults for an algorithm or environment.

They are all located in src/config. --config refers to the config files in src/config/algs --env-config refers to the config files in src/config/envs

All results will be stored in the Results folder.

The previous config files used for the SMAC Beta have the suffix _beta.

Adjacency Matrix

An adjacency matrix simply represents the vertices of the graph. For the current problem we have used a few heuristics for joining two nodes with a vertex. They are as below

  • Communication distance : - Even though their is no restriction in communication having local communication improves cooperation in shared tasks
  • Unit Type : - Many task benefit from similar units perfoming certain part of the task than other other units cooperating with each other.

Results

Below are the training and test metric of the presented algorithm with QMIX on map 2s3z. The study is limited to the number of experiments due to limitation in computation at the disposal. The presented algorithm does support parallel envs and boosting the process of training. This would be tested soon

Train --
Battle win percentage Average Return
Test --
Battle win percentage Average Return

Output

This is a demo output from the policy whose stats are given above

Weights and the logs can be found here [Drive].