
S20-team3-project

Requirements:

python 3.7
virtualenv

All remaining Python package dependencies are listed in requirements.txt.

Automatic setup

A bash script, setup.sh, is provided to set up the virtual environment and edit the Galaga gym-retro environment for you. To use it, ensure you have virtualenv and Python 3.7 installed, then simply run the script:

./setup.sh

Then, you must activate the virtual environment:

source venv/bin/activate

Once the virtual environment is active, if you would like to see a short demo of each network, run demo.py:

./demo/demo.py

To train a network yourself, change into either SingleQ/ or DoubleQ/, then into either Uniform/ or Prioritized/, and run:

./Galaga.py

To watch the gameplay, simply use:

./Galaga.py --play

Manual Setup

Create a Python 3.7 virtual environment:

virtualenv --python=/path/to/python3.7 venv # /usr/bin/python3.7 for example

Activate the virtual environment (ensure that you are in your local repository):

source venv/bin/activate

Install the requirements:

pip3 install -r requirements.txt

Edit the files noted within setup.sh to meet the environment requirements.

Finally, import the ROM:

python3 -m retro.import 'Galaga - Demons of Death (USA).nes'

Action Space

As designated by the Arcade Learning Environment Technical Manual[^1], we have selected six possible actions to control the game:

  • Do-Nothing (0)
  • Left (3)
  • Right (6)
  • Fire (9)
  • Left-Fire (12)
  • Right-Fire (15)

This serves as our "small action space", where we restrict the network's available actions to prevent it from "gaming" the reward system or getting stuck. The default action space for Galaga happens to be identical to this small action space, simply repeated, as per the atari-py manual[^1].
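For illustration only, a wrapper along these lines could map the six discrete actions above onto gym-retro's MultiBinary NES controller. This is a minimal sketch modelled on the Discretizer example that ships with gym-retro; the button combinations, the choice of 'B' as the fire button, and the integration name passed to retro.make are assumptions, not taken from this repository's code.

import gym
import numpy as np
import retro

class GalagaDiscretizer(gym.ActionWrapper):
    """Map a small discrete action set onto gym-retro's MultiBinary NES controls."""
    def __init__(self, env, combos):
        super().__init__(env)
        buttons = env.unwrapped.buttons  # NES button layout reported by gym-retro
        self._actions = []
        for combo in combos:
            arr = np.array([False] * env.action_space.n)
            for button in combo:
                arr[buttons.index(button)] = True
            self._actions.append(arr)
        self.action_space = gym.spaces.Discrete(len(self._actions))

    def action(self, act):
        return self._actions[act].copy()

# Hypothetical usage -- six actions: Do-Nothing, Left, Right, Fire, Left-Fire, Right-Fire.
# The game id and the use of 'B' as the fire button are assumptions about the integration.
env = retro.make(game='Galaga-DemonsOfDeath-Nes')
env = GalagaDiscretizer(env, combos=[[], ['LEFT'], ['RIGHT'], ['B'], ['LEFT', 'B'], ['RIGHT', 'B']])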

Implementations

Our plan is to implement four different versions of the Deep Q-Learning architecture. Utilizing research on the Memory Replay component, we intend to compare the benefit of Prioritized Memory Replay[^2] against that of the Uniform Memory Replay we have taken as the norm of Deep Q-Learning.
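As a rough illustration of the difference (a sketch only, not the code in this repository): uniform replay samples stored transitions with equal probability, while proportional prioritized replay[^2] samples transition i with probability P(i) = p_i^alpha / sum_k p_k^alpha, where p_i is derived from the TD error, and corrects the resulting bias with importance-sampling weights. A real implementation typically uses a sum-tree for O(log N) sampling; the linear scan below is kept for clarity.

import numpy as np

class PrioritizedReplay:
    """Proportional prioritized replay, kept O(N) for readability."""
    def __init__(self, capacity, alpha=0.6, beta=0.4):
        self.capacity, self.alpha, self.beta = capacity, alpha, beta
        self.buffer, self.priorities = [], []

    def add(self, transition):
        # New transitions get the current maximum priority so each is replayed at least once.
        priority = max(self.priorities, default=1.0)
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(0)
            self.priorities.pop(0)
        self.buffer.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size):
        probs = np.array(self.priorities) ** self.alpha
        probs /= probs.sum()
        idx = np.random.choice(len(self.buffer), batch_size, p=probs)
        # Importance-sampling weights correct for the non-uniform sampling.
        weights = (len(self.buffer) * probs[idx]) ** (-self.beta)
        weights /= weights.max()
        return [self.buffer[i] for i in idx], idx, weights

    def update_priorities(self, idx, td_errors, eps=1e-6):
        for i, err in zip(idx, td_errors):
            self.priorities[i] = abs(err) + eps

Uniform replay corresponds to the same buffer sampled at equal probability over all indices, with no importance-sampling correction.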

Further, after completing these first two implementations, we plan to implement both Memory Replay methodologies within a Double Q-Learning architecture[^3]. This will allow us to compare the viability of Prioritized Memory Replay within both architectures, and to evaluate the viability of Double Q-Learning itself.
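The only change Double Q-Learning makes is in how the bootstrap target is formed: the online network selects the next action and the target network evaluates it, which reduces overestimation bias. A hedged sketch of the two targets follows; the array names and batch-of-Q-values shapes are illustrative, not this repository's code.

import numpy as np

def dqn_target(rewards, next_q_target, gamma, dones):
    # Standard DQN: the target network both selects and evaluates the next action.
    return rewards + gamma * (1 - dones) * next_q_target.max(axis=1)

def double_dqn_target(rewards, next_q_online, next_q_target, gamma, dones):
    # Double DQN[^3]: action selected by the online network, evaluated by the target network.
    best_actions = next_q_online.argmax(axis=1)
    chosen_q = next_q_target[np.arange(len(best_actions)), best_actions]
    return rewards + gamma * (1 - dones) * chosen_q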

Parameters and training regimens will be standardized across all implementations to ensure reproducibility and comparability.

References

1: https://github.com/openai/atari-py/blob/master/doc/manual/manual.pdf

2: https://arxiv.org/abs/1511.05952

3: https://arxiv.org/abs/1509.06461