RL-Algorithms

The resources used are listed in each .ipynb file.

The ES + A2C combination converges earlier and is more stable across episodes.

The ES algorithm used is from Evolution-Guided Policy Gradient in Reinforcement Learning - https://arxiv.org/abs/1805.07917
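
As a rough illustration of the evolutionary half of that idea, here is a minimal sketch of one generation step: rank the population, keep the elites unchanged, refill the remaining slots with mutated tournament winners, and optionally copy in a gradient-trained policy. This is not the code from the notebook. The function name `next_generation`, the parameters `n_elite`, `sigma`, and `injected`, and the toy fitness function are all assumptions made for the example, policies are stood in for by flat NumPy vectors, and the crossover step described in the paper is left out for brevity.

```python
import numpy as np

def next_generation(population, fitness_fn, rng, n_elite=2, sigma=0.1, injected=None):
    """One evolutionary step over a list of parameter vectors (hypothetical helper)."""
    scores = np.array([fitness_fn(p) for p in population])
    order = np.argsort(scores)[::-1]  # indices sorted best first

    # Elites survive unchanged, so the best score never gets worse.
    new_pop = [population[i].copy() for i in order[:n_elite]]

    # Reserve one slot for the gradient-trained policy if one is supplied.
    n_slots = len(population) - (1 if injected is not None else 0)
    while len(new_pop) < n_slots:
        # Tournament of size 2: the fitter of two random members becomes the parent.
        a, b = rng.choice(len(population), size=2, replace=False)
        parent = population[a] if scores[a] >= scores[b] else population[b]
        # Gaussian mutation of the parent's parameters.
        new_pop.append(parent + sigma * rng.standard_normal(parent.shape))

    if injected is not None:
        # Assumption: the injected policy simply takes the reserved slot.
        new_pop.append(injected.copy())
    return new_pop, float(scores[order[0]])

# Toy stand-in for episode return: closer to `target` is better.
rng = np.random.default_rng(0)
target = np.array([1.0, -2.0, 0.5])
fitness = lambda p: -float(np.sum((p - target) ** 2))

population = [rng.standard_normal(3) for _ in range(10)]
history = []
for gen in range(50):
    population, best = next_generation(population, fitness, rng)
    history.append(best)
```

In a real ES + A2C setup, `fitness_fn` would run one or more episodes with the given policy and return the episode reward, and `injected` would be the current A2C actor's parameters, passed in every few generations.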

Implementing preliminary RL algorithms:

DQN_cartpole.ipynb
ES_a2c_algorithm.ipynb
