QLearningMaze

Goal: reach the yellow oval while avoiding the black blocks and the moving enemy (red block).

Implementation of Q-Learning using the TD error to navigate a maze optimally while avoiding a moving enemy.

To run:

$ pip install numpy pandas
$ python main.py

The project comes with a trained Q-table stored in the pickled file actions. You can run it in the following ways:

Importing Q-table and running optimal policy

$ python main.py
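
To illustrate what importing a pickled Q-table and following its greedy policy can look like, here is a minimal sketch. It assumes the table is a plain dict mapping a state to a list of action values. The actual layout of the actions file may differ, and the names save_q_table, load_q_table, and greedy_action are hypothetical, not taken from this repository.

```python
import os
import pickle
import tempfile


def save_q_table(q_table, path):
    """Pickle a Q-table to disk (hypothetical helper, not from main.py)."""
    with open(path, "wb") as f:
        pickle.dump(q_table, f)


def load_q_table(path):
    """Load a pickled Q-table from disk (hypothetical helper)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def greedy_action(q_table, state):
    """Return the index of the highest-valued action for the given state."""
    values = q_table[state]
    return max(range(len(values)), key=lambda a: values[a])


if __name__ == "__main__":
    # Assumed toy table: one state (a grid cell) with four action values.
    demo_table = {(0, 0): [0.1, 0.5, -0.2, 0.0]}
    demo_path = os.path.join(tempfile.mkdtemp(), "demo_q_table.pkl")
    save_q_table(demo_table, demo_path)
    restored = load_q_table(demo_path)
    print(greedy_action(restored, (0, 0)))
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.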

Training

$ python main.py --test

Training + GUI

(slow, mostly for debugging)

$ python main.py --test --vis

Algorithm used

Q-values are updated based on the following formula:

pseudo formula

newVal = oldVal + learningRate * (reward + discount_val * maxValOfNextState - oldVal)
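
The pseudo formula above can be sketched in Python as follows. This is an illustration rather than the code in rl_brain.py: the function name q_update and the default values for learning_rate and discount are assumptions chosen for the example.

```python
def q_update(old_val, reward, max_next_val, learning_rate=0.1, discount=0.9):
    """Apply one Q-learning update to a single Q-value.

    The default learning_rate and discount are assumed values for
    illustration, not the ones used in this project.
    """
    # TD target: immediate reward plus the discounted best value of the next state.
    td_target = reward + discount * max_next_val
    # TD error: how far the current estimate is from the target.
    td_error = td_target - old_val
    # Move the old estimate a fraction of the way toward the target.
    return old_val + learning_rate * td_error


if __name__ == "__main__":
    # Example: current estimate 0.5, reward 1.0, best next-state value 2.0.
    print(q_update(0.5, 1.0, 2.0))
```

With learning_rate set to 1 the new value equals the TD target, and with learning_rate set to 0 the old value is left unchanged.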
