Comparing Exploitation-Based and Game Theory Optimal Based Approaches in a Multi-Agent Environment

In this project, we compared two approaches:

- BPR: an exploitative style of play that identifies and exploits imbalances in the opponents' strategies.
- MADDPG/M3DDPG: a game theory optimal (GTO) style of play that aims to make the agent unexploitable by its opponents.
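To make the contrast between the two styles concrete, here is a minimal sketch using rock-paper-scissors. It is an illustration only: it is not the environment in this repository and it implements neither BPR nor MADDPG/M3DDPG. The payoff matrix, the biased opponent, and all function names are assumptions made for this example.

```python
# Payoff to the row player; rows and columns are (rock, paper, scissors).
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def expected_payoff(row_strategy, col_strategy):
    """Expected payoff to the row player for two mixed strategies."""
    return sum(
        row_strategy[i] * col_strategy[j] * PAYOFF[i][j]
        for i in range(3)
        for j in range(3)
    )


def best_response(col_strategy):
    """Pure row strategy with the highest expected payoff against col_strategy."""
    values = [
        sum(col_strategy[j] * PAYOFF[i][j] for j in range(3)) for i in range(3)
    ]
    best = max(range(3), key=lambda i: values[i])
    return [1.0 if i == best else 0.0 for i in range(3)]


# GTO style: the uniform mix is the equilibrium strategy of this game.
uniform = [1 / 3, 1 / 3, 1 / 3]

# Hypothetical opponent that plays rock too often.
biased_opponent = [0.6, 0.2, 0.2]

# The GTO player cannot be exploited, but it also gains nothing from the bias.
gto_value = expected_payoff(uniform, biased_opponent)

# The exploitative player shifts to the best response against the bias.
exploit = best_response(biased_opponent)
exploit_value = expected_payoff(exploit, biased_opponent)

# The price of exploiting: a pure strategy can itself be countered.
always_scissors = [0.0, 0.0, 1.0]
counter_value = expected_payoff(exploit, always_scissors)
```

In this toy game the uniform strategy breaks even against the biased opponent, the best response (always paper) earns about 0.4 per round, and that same best response loses every round to an opponent who switches to scissors.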

Check the report for more detail.

Remarks

- `env.py` is the environment we developed to test the algorithms. You can interact with it by running `play_with_model.py`.
- The `train/` folder contains the code we used to train our agents.
  - Note that you may need to add a `sys.path.append` call so that `import env` works.
- The trained MADDPG/M3DDPG agents are stored as pickle objects so they can be reused without retraining.
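The two remarks about `sys.path.append` and pickled agents can be sketched as follows. This is a minimal sketch under stated assumptions, not the code used in this repository: the helper names (`add_repo_root_to_path`, `save_agent`, `load_agent`), the file name `agent.pickle`, and the assumption that the script sits one folder below the directory containing `env.py` are all hypothetical. A plain dict stands in for a trained agent.

```python
import pickle
import sys
import tempfile
from pathlib import Path


def add_repo_root_to_path(script_file):
    """Append the parent of the script's folder to sys.path.

    Assumes the script lives one level below the folder that holds env.py.
    """
    repo_root = str(Path(script_file).resolve().parent.parent)
    if repo_root not in sys.path:
        sys.path.append(repo_root)
    return repo_root


def save_agent(agent, path):
    """Serialize an agent object to disk with pickle."""
    with open(path, "wb") as f:
        pickle.dump(agent, f)


def load_agent(path):
    """Load a previously pickled agent object."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical usage at the top of a script inside train/:
#   add_repo_root_to_path(__file__)
#   import env

# Stand-in for a trained agent (the real agent classes are not shown here).
agent = {"name": "stand_in_agent", "weights": [0.1, -0.2, 0.3]}

with tempfile.TemporaryDirectory() as tmp:
    agent_path = Path(tmp) / "agent.pickle"
    save_agent(agent, agent_path)
    restored = load_agent(agent_path)
```

Two general points about pickle apply here. Unpickling an instance of a custom class requires that class to be importable under the same module path used when it was saved, so the `sys.path` setup may also matter when loading agents. Pickle files can execute arbitrary code on load, so only load files you created or trust.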
