FlappyAI

Q-Learning with a Flappy Bird simulator.

FlappyAI Demo

Why?

FlappyAI is an AI that uses simple Q-Learning, trained on a custom Flappy Bird simulator. It was created mainly for learning purposes. If you are looking for a Q-Learning library that is more efficient and do not mind the steep learning curve, try TensorFlow instead.

Structure

FlappyAI contains these main parts:

  • FlappyGame - Flappy Bird simulator
  • Trainer - Q-Learning agent
  • ModelInterface - Interface for the model used by the agent
  • main.py - Entry point where everything is tied together

Dependencies

FlappyAI is written in Python. The required libraries include:

  • pygame - Required for the graphics
  • Standard Python 2 libraries

Extending FlappyAI to Other Games

Simply create a new interface for your simulator that inherits from ModelInterface:

from qlearning import Trainer, ModelInterface

class MyInterface(ModelInterface):
    # Override the ModelInterface hooks here so the agent can observe
    # states, perform actions and collect rewards from your simulator.
    pass

agent = Trainer(MyInterface())  # creates a new agent with your interface
agent.train()                   # starts the training

How to Run

Type

python main.py -h

for help. FlappyAI has three modes:

  • interactive - No AI involved; only human input.
  • train - Start the training process.
  • test - Let the AI play the game.
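
For example, to train the agent or watch it play (assuming the mode is passed as a positional argument; run python main.py -h to confirm the exact usage):

python main.py train
python main.py test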

FlappyAI automatically looks for qtable.p in the current directory and creates a new one if it cannot find one. This file stores the trained Q-table (KNAWLEDGE!). The Q-table file included in the repository is already trained; if you want to retrain the agent, simply remove or rename the file.

While in training mode, press Ctrl+C to stop and save the Q-table to the file.
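
The .p extension suggests the table is stored with Python's pickle module. A minimal sketch of how such a file could be loaded and saved (the exact layout used by this repository may differ):

import os
import pickle

QTABLE_FILE = "qtable.p"

# Load an existing Q-table, or start fresh with an empty one.
if os.path.exists(QTABLE_FILE):
    with open(QTABLE_FILE, "rb") as f:
        qtable = pickle.load(f)
else:
    qtable = {}  # e.g. maps (state, action) -> estimated value

# ... training updates qtable here ...

# Persist the table so a later run can resume or play with it.
with open(QTABLE_FILE, "wb") as f:
    pickle.dump(qtable, f)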

Technical Details

State Space

Assuming a game size of 640x480, the parameters considered by the AI are the following:

  • Horizontal distance between the bird and the nearest pipe, divided by 10 (i.e. 0 to 300/10, plus a single 300+ bucket)
  • Vertical distance between the bird and the nearest pipe, divided by 10 (-480/10 to 480/10)
  • Bird velocity, divided by 5 (-10/5 to 20/5)

With 32 horizontal buckets, 97 vertical buckets, 7 velocity buckets and two possible actions (flap or do nothing), this results in a total of 32 × 97 × 7 × 2 = 43456 entries in the Q-table.

The 2D space is discretized into 10x10 tiles; the framerate of the AI is therefore reduced by a factor of 10 (i.e. the game runs at 60fps while the agent runs at 6fps).
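
A minimal sketch of this discretization (the function name and exact bucket boundaries are illustrative assumptions, not the repository's code):

def discretize(dx, dy, vel):
    """Map raw game measurements onto the coarse state grid.

    dx:  horizontal distance to the nearest pipe, in pixels
    dy:  vertical distance to the nearest pipe, in pixels
    vel: vertical velocity of the bird
    """
    h = min(dx // 10, 31)  # 10px buckets; anything past ~300px shares one bucket
    v = dy // 10           # 10px buckets, roughly -48 to 48
    s = vel // 5           # velocity buckets, roughly -2 to 4
    return (h, v, s)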

Q-Learning Background

Q-Learning is a simple reinforcement learning algorithm that has three parameters:

  • α - Learning rate
  • γ - Discount factor
  • ε - Exploration probability (only in ε-greedy Q-Learning)

Definitions:

  • Let S be the set of all states and A be the set of all actions.
  • Define Q: S × A → R, where R is the set of all real numbers.
  • The algorithm is then as follows:
    • Q(s_t, a_t) ← Q(s_t, a_t) + α (r_t + γ max over all actions a of Q(s_(t+1), a) - Q(s_t, a_t))
    • Repeat until the Q-values converge (or converge to within a small tolerance).
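
A compact sketch of one ε-greedy step under this update rule (a generic illustration, not the repository's Trainer code; ACTIONS and the qtable layout are assumptions):

import random

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.05  # learning rate, discount factor, exploration probability
ACTIONS = (0, 1)                        # e.g. 0 = do nothing, 1 = flap

def choose_action(qtable, state):
    # With probability EPSILON explore randomly; otherwise act greedily.
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: qtable.get((state, a), 0.0))

def q_update(qtable, state, action, reward, next_state):
    # Q(s_t, a_t) <- Q(s_t, a_t) + α (r_t + γ max_a Q(s_(t+1), a) - Q(s_t, a_t))
    best_next = max(qtable.get((next_state, a), 0.0) for a in ACTIONS)
    old = qtable.get((state, action), 0.0)
    qtable[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)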
