RL A3C Pytorch

This repository contains my PyTorch implementation of Asynchronous Advantage Actor-Critic (A3C), an algorithm from Google DeepMind's paper "Asynchronous Methods for Deep Reinforcement Learning".
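For readers new to the "advantage" in A3C, here is a minimal sketch of how discounted n-step returns and advantages are commonly computed for one rollout segment. It is not taken from this repository's code: the function name, the pure-Python lists, and the default discount factor of 0.99 are all assumptions made for illustration.

```python
from typing import List, Tuple


def n_step_returns_and_advantages(
    rewards: List[float],
    values: List[float],
    bootstrap_value: float,
    gamma: float = 0.99,  # assumed discount factor, not read from this repo
) -> Tuple[List[float], List[float]]:
    """Compute discounted returns and advantages for one rollout segment.

    rewards[t] is the reward at step t, values[t] is the critic's estimate
    for the state at step t, and bootstrap_value is the critic's estimate
    for the state after the last step (0.0 if the episode ended).
    """
    if len(rewards) != len(values):
        raise ValueError("rewards and values must have the same length")

    returns = [0.0] * len(rewards)
    running = bootstrap_value
    # Walk backwards so each return reuses the one that follows it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running

    # Advantage: how much better the observed return was than predicted.
    advantages = [ret - val for ret, val in zip(returns, values)]
    return returns, advantages
```

In an actor-critic setup, the advantages would typically weight the policy-gradient term while the returns serve as regression targets for the critic.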

Requirements

Python 3+
OpenAI Gym
PyTorch

A3C GRU

I implemented an A3C model using GRUs rather than LSTMs and trained it on two Atari 2600 environments provided by OpenAI Gym: PongDeterministic-v4 and BreakoutDeterministic-v4. So far, the model has completed the game of Pong and has an average score of 329.5 in Breakout. Saved models are in the Pong-v4 and Breakout-v4 folders. Trained models may not run properly if you have an older version of gym and v3 Atari environments. To make sure they run properly, you need to keep gym version <= 0.9.1 and atari-py version <= 0.1.1.
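The version limits above can be checked before loading a saved model. The following is a small sketch, assuming Python 3.8+ (for `importlib.metadata`); the helper names and the `MAX_VERSIONS` table are hypothetical and not part of this repository.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# Upper bounds quoted in this README.
MAX_VERSIONS = {"gym": "0.9.1", "atari-py": "0.1.1"}


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn '0.9.1' into (0, 9, 1); stop at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_at_most(installed: str, maximum: str) -> bool:
    """Return True if the installed version does not exceed the maximum."""
    return version_tuple(installed) <= version_tuple(maximum)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins() -> Dict[str, Optional[bool]]:
    """Map each package to True/False (within limit) or None (missing)."""
    report: Dict[str, Optional[bool]] = {}
    for package, maximum in MAX_VERSIONS.items():
        found = installed_version(package)
        report[package] = None if found is None else is_at_most(found, maximum)
    return report
```

Calling `check_pins()` returns, for each package, `True` if it is within the limit, `False` if it is too new, and `None` if it is not installed.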

You can use either RMSprop or Adam as the optimizer that shares statistics between the networks.

The Adam optimizer was used to train the weights in the saved .pkl files.

Training

Limit the number of worker threads to the number of CPU cores available, as too many threads (e.g. more than one thread per CPU core) will actually decrease training speed and effectiveness.
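One way to follow this advice is to cap the requested worker count at the number of cores the machine reports. This is a sketch only; `capped_worker_count` is a hypothetical helper and is not part of main.py.

```python
import os


def capped_worker_count(requested: int) -> int:
    """Return the requested worker count, limited to the CPU cores available."""
    if requested < 1:
        raise ValueError("at least one worker is required")
    # os.cpu_count() can return None when the count is undetermined.
    available = os.cpu_count() or 1
    return min(requested, available)
```

The result could then be passed as the value of `--num-processes` in the commands that follow.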

To train an agent in the PongDeterministic-v4 environment with 4 worker threads:

python main.py --env-name PongDeterministic-v4 --num-processes 4

or to train an agent in the BreakoutDeterministic-v4 environment:

python main.py --env-name BreakoutDeterministic-v4 --num-processes 4

Pong takes approximately 40 minutes to finish when 4 worker threads are used.

For me, Breakout took more than 10 hours to reach a score of 300 with 4 worker threads and less than 5 hours with 8 worker threads.

The test.py script saves the weights whenever the score is greater than or equal to 300, while training continues.
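The save-on-threshold behaviour described above can be sketched as follows. This is not the code in test.py: `maybe_save` and its `save_fn` callback are hypothetical names, and the actual save step is injected so the sketch stays independent of PyTorch.

```python
from typing import Callable


def maybe_save(
    score: float,
    save_fn: Callable[[], None],
    threshold: float = 300.0,  # threshold quoted in this README
) -> bool:
    """Call save_fn when score reaches the threshold; report whether it ran.

    With PyTorch, save_fn could be something like
    `lambda: torch.save(model.state_dict(), path)`, where `model` and
    `path` are placeholders assumed for illustration.
    """
    if score >= threshold:
        save_fn()
        return True
    return False
```

Because the function only saves and returns, the caller's evaluation loop keeps running afterwards, which matches the note that training continues.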

Test Run on Gym

To run a 50-episode Gym evaluation with a trained model:

python gym_test.py --env-name PongDeterministic-v4 --num-episodes 50

or

python gym_test.py --env-name BreakoutDeterministic-v4 --num-episodes 50
