
This repository contains most of the classic deep reinforcement learning algorithms, including DQN, DDQN, Dueling Network, DDPG, A3C, PPO, and TRPO. (More algorithms are in progress.)


tondji/reinforcement-learning-algorithms


Deep Reinforcement Learning Algorithms

MIT License
This repository implements the classic deep reinforcement learning algorithms. Its aim is to provide clear code for people learning deep reinforcement learning. More algorithms will be added in the future, and the existing code will continue to be maintained.

Update Information

2018-10-17 - In this update, most of the algorithms have been improved, and more experiments with plots have been added (except for DDPG). PPO now supports Atari games and MuJoCo environments. TRPO is much more stable and achieves better results.

TODO List

  • Add prioritized experience replay.
  • Stop depending on the pre-processing functions from OpenAI Baselines.
  • Improve DDPG.

Requirements

Installation

  1. Install PyTorch.
Please go to the official website to install it: https://pytorch.org/

Using an Anaconda virtual environment to manage your packages is recommended.
  2. Install openai-baselines. (OpenAI Baselines changes quickly, so please use the older version shown below; this will be resolved in the future.)
# clone the openai baselines
git clone https://github.com/openai/baselines.git
cd baselines
git checkout 366f486
pip install -e .
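
After both installation steps, a quick check that the packages are visible to your Python environment can save some debugging later. The snippet below is only a minimal sketch and is not part of this repository; the helper name check_packages is hypothetical, and it assumes the packages are importable as torch and baselines.

```python
import importlib.util


def check_packages(names):
    """Return {name: bool} telling whether each top-level package can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# "torch" and "baselines" are the import names of PyTorch and OpenAI Baselines.
for name, found in check_packages(["torch", "baselines"]).items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

If either package is reported as missing, make sure the same virtual environment is active as the one used for the installation.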

Instructions

  1. Select a suitable algorithm.
cd <the-rl-algorithm>
  2. All of the parameters are defined in arguments.py; you can train your model with suitable hyper-parameters.
  3. Train the network (the --cuda flag enables the GPU; TRPO is the only algorithm that does not support GPU).
python train_network.py --env-name=<env-name> --cuda --<other-flags>
  4. Test the network.
python demo.py --env-name=<env-name>
  5. Download the pre-trained models.
    Please download them from Google Drive, then put saved_models under the corresponding algorithm's folder.
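
To give a feel for how the flags in the commands above relate to arguments.py, here is a rough sketch of what such a file could look like with argparse. Only --env-name and --cuda come from the instructions above; the function name get_args, the --lr and --gamma flags, and all default values are assumptions made for illustration. Check the arguments.py of the algorithm you are using for the real list.

```python
import argparse


def get_args(argv=None):
    """Parse command-line flags; pass a list to parse it instead of sys.argv."""
    parser = argparse.ArgumentParser(description="illustrative arguments.py sketch")
    # Flags that appear in the training/demo commands of this README.
    parser.add_argument("--env-name", type=str, default="PongNoFrameskip-v4",
                        help="environment id (the default is only an example)")
    parser.add_argument("--cuda", action="store_true",
                        help="train on the GPU if set")
    # Hypothetical hyper-parameters, shown only to illustrate the pattern.
    parser.add_argument("--lr", type=float, default=1e-4, help="learning rate")
    parser.add_argument("--gamma", type=float, default=0.99, help="discount factor")
    return parser.parse_args(argv)


# Equivalent to: python train_network.py --env-name=PongNoFrameskip-v4 --cuda
args = get_args(["--env-name=PongNoFrameskip-v4", "--cuda"])
print(args.env_name, args.cuda, args.lr, args.gamma)
```

Note that argparse turns the dash in --env-name into an underscore, so the value is read as args.env_name.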

Performance of the algorithms

Deep Q Network (DQN)

dqn_performance

Double DQN

ddqn_performance

Dueling Network

dueling_network

Advantage Actor Critic (A2C)

a2c

Trust Region Policy Optimization (TRPO)

trpo

Proximal Policy Optimization (PPO)

ppo

Acknowledgement:

Papers Related to the Deep Reinforcement Learning

[1] A Brief Survey of Deep Reinforcement Learning
[2] The Beta Policy for Continuous Control Reinforcement Learning
[3] Playing Atari with Deep Reinforcement Learning
[4] Deep Reinforcement Learning with Double Q-learning
[5] Dueling Network Architectures for Deep Reinforcement Learning
[6] Continuous control with deep reinforcement learning
[7] Continuous Deep Q-Learning with Model-based Acceleration
[8] Asynchronous Methods for Deep Reinforcement Learning
[9] Trust Region Policy Optimization
[10] Proximal Policy Optimization Algorithms
[11] Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
