README

This is a reinforcement learning algorithm library. The code takes into account both performance and simplicity, with little dependence. The algorithm code comes from spinningup and some researchers engaged in reinforcement learning.

Algorithms

The project covers the following algorithms：

DQN, Dueling DQN, D3QN
DDPG, DDPG-HER, DDPG-PER
PPO+GAE, Multi-Processing PPO, Discrete PPO
TD3, Multi-Processing TD3
SAC
MADDPG

All the algorithms adopt the pytorch framework. All the codes are combined in the easiest way to understand, which is suitable for beginners of reinforcement learning, but the code performance is excellent.

Reference

This project also provides the reference of these algorithms:

Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
CONTINUOUS CONTROL WITH DEEP REINFORCEMENT
High-Dimensional Continuous Control Using Generalized Advantage Estimation
Proximal Policy Optimization
Soft Actor-Critic Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Auto alpha Soft Actor-Critic Algorithms and Applications
Addressing Function Approximation Error in Actor-Critic Methods

Results

Testing environment: 'Pendulum-v0'. What you just need to do is running the main.py. Here are the results of several cases:

spinningup-DDPG reward curve:

spinningup-TD3 reward curve:

spinningup-SAC reward curve:

Project tree

DRL_algorithm_library
├─ Arm_env.py
├─ DDPG
│    ├─ DDPG
│    ├─ DDPG_spinningup
│    └─ DDPG_spinningup_PER
├─ DQN_family
│    ├─ Agent.py
│    ├─ __pycache__
│    ├─ core.py
│    ├─ main.py
│    └─ runs
├─ MADDPG
│    ├─ .gitignore
│    ├─ .idea
│    ├─ __pycache__
│    ├─ arguments.py
│    ├─ enjoy_split.py
│    ├─ logs
│    ├─ main_openai.py
│    ├─ model.py
│    ├─ models
│    └─ replay_buffer.py
├─ PPO
│    ├─ .idea
│    ├─ DiscretePPO
│    ├─ PPOModel.py
│    ├─ TrainedModel
│    ├─ __pycache__
│    ├─ core.py
│    ├─ draw.py
│    ├─ multi_processing_ppo
│    └─ myPPO.py
├─ README.md
├─ SAC
│    ├─ SAC_demo1
│    └─ SAC_spinningup
├─ TD3
│    ├─ .idea
│    ├─ Multi-Processing-TD3
│    ├─ TD3
│    ├─ TD3_spinningup
│    └─ TrainedModel
├─ __pycache__
│    ├─ Arm_env.cpython-36.pyc
│    └─ Arm_env.cpython-38.pyc
├─ imgs
│    ├─ spinSAC.png
│    ├─ spin_ddpg.png
│    └─ spin_td3.png
└─ reference
       ├─ 多智能体 MADDPG - Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments - 1706.02275.pdf
       ├─ 强化学习 DDPG - CONTINUOUS CONTROL WITH DEEP REINFORCEMENT 1509.02971.pdf
       ├─ 强化学习 GAE High-Dimensional Continuous Control Using Generalized Advantage Estimation 1506.02438.pdf
       ├─ 强化学习 PPO - Proximal Policy Optimization1707.06347.pdf
       ├─ 强化学习 SAC1 - Soft Actor-Critic Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor - 1801.01290.pdf
       ├─ 强化学习 SAC2 auto alpha  Soft Actor-Critic Algorithms and Applications 1812.05905.pdf
       └─ 强化学习 TD3 - Addressing Function Approximation Error in Actor-Critic Methods 1802.09477.pdf

Requirements

gym==0.10.5

matplotlib==3.2.2

pytorch==1.7.1

numpy==1.19.2

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

.idea

.idea

DDPG

DDPG

DQN_family

DQN_family

MADDPG

MADDPG

PPO

PPO

SAC

SAC

TD3

TD3

pycache

pycache

imgs

imgs

reference

reference

Arm_env.py

Arm_env.py

README.md

README.md

Repository files navigation

README

Algorithms

Reference

Results

Project tree

Requirements

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 23 Commits
.idea		.idea
DDPG		DDPG
DQN_family		DQN_family
MADDPG		MADDPG
PPO		PPO
SAC		SAC
TD3		TD3
__pycache__		__pycache__
imgs		imgs
reference		reference
Arm_env.py		Arm_env.py
README.md		README.md

ZYunfeii/DRL_algorithm_library

Folders and files

Latest commit

History

Repository files navigation

README

Algorithms

Reference

Results

Project tree

Requirements

About

Topics

Resources

Stars

Watchers

Forks

Languages