SAIDA RL

Welcome to SAIDA RL! This is an open-source platform where anyone interested in StarCraft I and reinforcement learning can try out and evaluate their own models and algorithms.

What is SAIDA RL?

It is a simulator in which users can train and evaluate their own algorithms and models in the challenging StarCraft I environment. It provides not only the simulator itself but also tutorials and API documentation for development. It is specialized for StarCraft I and provides many scenarios. You can experience a fresh, challenging game and try out your own ideas for making StarCraft units play better.

Basics

Anyone who is familiar with reinforcement learning concepts and with how OpenAI Gym works can skip this section. In reinforcement learning, there are two parties: the environment and the agent. The agent sends actions to the environment, and the environment replies with observations and rewards.

SAIDA RL inherits the interface of the gym Env class and provides baseline algorithms and agent source code that are independent of Env. Whether you use these baselines is up to you. The following are the Env methods you should know:

function	explanation
init(self)	Not used.
render(self)	Not used.
reset(self)	Reset the environment's state. Returns observation.
step(self, action)	Step the environment by one timestep. Returns observation, reward, done, info.
close(self)	Close connection with Starcraft.
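
The interaction loop described above can be sketched as follows. This is a minimal illustration, not SAIDA RL code: StubEnv is a hypothetical stand-in that only mimics the reset/step/close signatures from the table so the loop can run without StarCraft, and its observations and rewards are made up.

```python
import random


class StubEnv:
    """Hypothetical stand-in that mimics the reset/step/close interface.

    It is NOT the SAIDA RL environment; it only lets the loop run offline.
    """

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0
        self.closed = False

    def reset(self):
        # Reset internal state and return the first observation.
        self.t = 0
        return self.t

    def step(self, action):
        # Advance one timestep and return (observation, reward, done, info).
        self.t += 1
        reward = 1.0 if action == 1 else 0.0  # made-up reward rule
        done = self.t >= self.episode_length
        return self.t, reward, done, {}

    def close(self):
        # A real environment would close its connection to StarCraft here.
        self.closed = True


def run_episode(env, policy):
    """Run one episode and return the summed reward."""
    observation = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
    return total_reward


env = StubEnv(episode_length=5)
random.seed(0)
random_return = run_episode(env, lambda obs: random.choice([0, 1]))
always_one_return = run_episode(env, lambda obs: 1)
env.close()
```

Any object with the same three methods could be passed to run_episode in place of the stub, which is the point of sharing the gym interface.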

Guide

Three sets of documentation are available to help you run an agent in this environment.

Environment

We built the environment on top of OpenAI Gym. It consists of the interface shown below.

Agent

The agent we provide is based on keras-rl, one of the most commonly used reinforcement learning frameworks, which we extended ourselves to support more features. You can also use your own agent if you prefer. The agent and the environment are decoupled: there are no dependencies between them, so the environment is compatible with most numerical computation libraries, such as TensorFlow or Theano. You can use it from Python code, and soon from other languages. If you're not sure where to start, we recommend beginning with the tutorials on our site.
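
To illustrate what this decoupling means in practice, here is a small sketch. The Agent, RandomAgent, and CountdownEnv classes and the act/observe method names are hypothetical, chosen for this example only; they are not the keras-rl or SAIDA RL API. The agent never imports or inspects the environment, and the driver function only relies on reset() and step().

```python
import random


class Agent:
    """Hypothetical minimal agent interface with no knowledge of the environment."""

    def act(self, observation):
        raise NotImplementedError

    def observe(self, reward, done):
        # Optional learning hook; the base agent learns nothing.
        pass


class RandomAgent(Agent):
    """Picks a uniformly random action index; only needs the action count."""

    def __init__(self, n_actions, seed=None):
        self.n_actions = n_actions
        self.rng = random.Random(seed)

    def act(self, observation):
        return self.rng.randrange(self.n_actions)


class CountdownEnv:
    """Hypothetical toy environment: the episode ends after `steps` steps."""

    def __init__(self, steps=3):
        self.steps = steps
        self.remaining = steps

    def reset(self):
        self.remaining = self.steps
        return self.remaining

    def step(self, action):
        self.remaining -= 1
        return self.remaining, 0.0, self.remaining == 0, {}


def run(agent, env):
    """Drive any agent against any environment exposing reset() and step()."""
    observation, done, actions = env.reset(), False, []
    while not done:
        action = agent.act(observation)
        observation, reward, done, _ = env.step(action)
        agent.observe(reward, done)
        actions.append(action)
    return actions


actions = run(RandomAgent(n_actions=4, seed=0), CountdownEnv(steps=3))
```

Swapping in a different agent, or a different environment, requires no change to the other side.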

Scenarios

We provide various challenging scenarios to motivate you to solve them with reinforcement learning algorithms.

Map Name	Env Name	Description	Terrain (Y/N)	Agent	Action space	Termination Condition
Vul_VS_Zeal_v0(~3)	VultureVsZealot	The agent (a Terran Vulture) should kill all Protoss Zealots while taking as little damage as possible. The number of Zealots and the presence of terrain depend on the map version.	Depends on the map version	Vulture	Move in a specific direction, Patrol to enemy (meaning attack)	All Zealots are killed or the agent is defeated
Avoid_Observer_v0	AvoidObserver	Reach the top of the map while avoiding the Observers in the middle area.	N	Scourge	Move in a specific direction	The agent reaches the goal or collides with an Observer
Avoid_Reaver_v0	AvoidReaver	Reach the bottom-right area of the map while avoiding the Reavers in the middle area.	N	Dropship	Move in a specific direction	The agent reaches the goal
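
A "move in a specific direction" action space is commonly discretized into evenly spaced headings. The sketch below shows one way to turn an action index into a movement offset. The number of directions, the step distance, and the coordinate convention are all assumptions made for illustration; the actual values used by each scenario are not stated here.

```python
import math


def direction_to_offset(action, n_directions=8, distance=4.0):
    """Map a discrete action index to an (dx, dy) offset.

    n_directions and distance are hypothetical defaults, not SAIDA RL values.
    Action 0 points along the positive x axis; indices increase with the angle.
    """
    if not 0 <= action < n_directions:
        raise ValueError("action index out of range")
    angle = 2.0 * math.pi * action / n_directions
    return distance * math.cos(angle), distance * math.sin(angle)


east = direction_to_offset(0)
quarter_turn = direction_to_offset(2)
```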

Algorithms

We divide the algorithms into three categories.

Value based

Before DQN

QLearning
SARSA
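
For readers new to these two methods, the sketch below contrasts their tabular update rules as given in Sutton and Barto. It is a generic illustration, not the SAIDA RL baseline code: the function names and the dictionary-based Q-table are choices made for this example, and terminal-state handling is omitted for brevity.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, n_actions,
                      alpha=0.1, gamma=0.99):
    """Off-policy update: bootstrap from the best action in the next state."""
    best_next = max(q[(next_state, a)] for a in range(n_actions))
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.99):
    """On-policy update: bootstrap from the action actually taken next."""
    target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (target - q[(state, action)])


# Two identical Q-tables so the two rules can be compared on the same transition.
q_a = defaultdict(float)
q_a[("s1", 0)] = 1.0
q_a[("s1", 1)] = 2.0
q_b = defaultdict(float, q_a)

q_learning_update(q_a, "s0", 0, reward=1.0, next_state="s1",
                  n_actions=2, alpha=0.5, gamma=0.5)
sarsa_update(q_b, "s0", 0, reward=1.0, next_state="s1",
             next_action=0, alpha=0.5, gamma=0.5)
```

On this transition Q-learning uses the larger next-state value (2.0) while SARSA uses the value of the action it actually follows (1.0), so the two tables diverge after a single update.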

DQN with variants

DQN [1], [2]
Dueling DQN [8]
Double DQN [3]
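
The three variants differ mainly in how a Q-value or its training target is computed. The sketch below shows those differences on plain Python lists, following the formulas in the cited papers. It is a simplified illustration with hypothetical function names, not the baseline implementation; real agents compute these quantities on batches of network outputs.

```python
def dqn_target(reward, done, next_q_target, gamma=0.99):
    """DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """Double DQN: the online network selects, the target network evaluates."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def dueling_q_values(value, advantages):
    """Dueling DQN: combine a state value with mean-centered advantages."""
    mean_advantage = sum(advantages) / len(advantages)
    return [value + a - mean_advantage for a in advantages]


next_q_online = [1.0, 3.0, 2.0]
next_q_target = [5.0, 0.5, 4.0]

y_dqn = dqn_target(1.0, False, next_q_target, gamma=0.5)
y_ddqn = double_dqn_target(1.0, False, next_q_online, next_q_target, gamma=0.5)
q_dueling = dueling_q_values(1.0, [1.0, 2.0, 3.0])
```

With these numbers the DQN target takes the maximum of the target network's estimates, while the Double DQN target evaluates the action preferred by the online network, which yields a smaller value and illustrates how the overestimation is reduced.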

Deep Recurrent DQN

DRQN [11]

Policy based

Multi Agent algorithms

Working Examples

Demos of well-trained agents playing

Grid World in Starcraft I

As a warm-up, you can try this problem yourself.

Avoid Observers

The Scourge's goal is to reach the top area of the map while avoiding collisions with the surrounding Observers.

Avoid Reavers

The Dropship's goal is to reach the bottom-right area of the map while avoiding the surrounding Reavers.

Vulture 1 vs Zealot 1

Battle between one Vulture and one Zealot.

Vulture 1 vs Zealot 2

Battle between one Vulture and two Zealots.

Plan

We will add more challenging scenarios.
Multi Agent algorithms

References

[1] Playing Atari with Deep Reinforcement Learning, Mnih et al., 2013
[2] Human-level control through deep reinforcement learning, Mnih et al., 2015
[3] Deep Reinforcement Learning with Double Q-learning, van Hasselt et al., 2015
[4] Continuous control with deep reinforcement learning, Lillicrap et al., 2015
[5] Asynchronous Methods for Deep Reinforcement Learning, Mnih et al., 2016
[6] Continuous Deep Q-Learning with Model-based Acceleration, Gu et al., 2016
[7] Deep Reinforcement Learning (MLSS lecture notes), Schulman, 2016
[8] Dueling Network Architectures for Deep Reinforcement Learning, Wang et al., 2016
[9] Reinforcement learning: An introduction, Sutton and Barto, 2011
[10] Proximal Policy Optimization Algorithms, Schulman et al., 2017
[11] Deep Recurrent Q-Learning for Partially Observable MDPs, Hausknecht and Stone, 2015
[12] Multi-agent actor-critic for mixed cooperative-competitive environments, Lowe et al., 2017
[13] Multiagent Bidirectionally-Coordinated Nets: Emergence of Human-level Coordination in Learning to Play StarCraft Combat Games, Peng et al., 2017
[14] Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, Williams, 1992

