DDPG

Continuous control with deep reinforcement learning

http://arxiv.org/abs/1509.02971

The goal of these algorithms is to perform policy iteration by alternating between two steps: policy evaluation, which estimates the action-value function of the current policy with Q-learning, and policy improvement, which updates the current policy by following the policy gradient.
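The alternation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in this repository: it assumes a scalar state and action, a hypothetical linear actor `mu(s) = theta * s`, and a hypothetical critic that is linear in hand-picked features, where the paper uses neural networks. The target parameters and soft updates follow the paper; the replay buffer and exploration noise are omitted, and all function names are made up for this example.

```python
import numpy as np


def features(s, a):
    # Hypothetical quadratic features, so that Q(s, a) is differentiable in a.
    return np.array([1.0, s, a, s * a, a * a, s * s])


def q_value(w, s, a):
    # Critic: Q(s, a) = w . features(s, a)
    return float(w @ features(s, a))


def dq_da(w, s, a):
    # Analytic derivative of the critic with respect to the action.
    return float(w[2] + w[3] * s + 2.0 * w[4] * a)


def critic_step(w, w_targ, theta_targ, batch, gamma=0.99, lr=1e-2):
    # Policy evaluation: one semi-gradient TD step toward the target
    # y = r + gamma * Q_targ(s', mu_targ(s')).
    grad = np.zeros_like(w)
    for s, a, r, s2 in batch:
        y = r + gamma * q_value(w_targ, s2, theta_targ * s2)
        grad += (y - q_value(w, s, a)) * features(s, a)
    return w + lr * grad / len(batch)


def actor_step(theta, w, batch, lr=1e-2):
    # Policy improvement: ascend the deterministic policy gradient,
    # dQ/da evaluated at a = mu(s), times dmu/dtheta = s.
    grad = 0.0
    for s, _, _, _ in batch:
        grad += dq_da(w, s, theta * s) * s
    return theta + lr * grad / len(batch)


def soft_update(target, online, tau=0.01):
    # Slowly track the online parameters with the target parameters.
    return (1.0 - tau) * target + tau * online


if __name__ == "__main__":
    # Toy transitions (s, a, r, s'); the values are arbitrary.
    batch = [(1.0, 0.5, 1.0, 0.5), (-1.0, 0.2, 0.0, 0.3)]
    w, theta = np.zeros(6), 0.0
    w_targ, theta_targ = w.copy(), theta
    for _ in range(10):
        w = critic_step(w, w_targ, theta_targ, batch)
        theta = actor_step(theta, w, batch)
        w_targ = soft_update(w_targ, w)
        theta_targ = soft_update(theta_targ, theta)
    print(w, theta)
```

Each loop iteration performs one evaluation step on the critic followed by one improvement step on the actor, then moves the target parameters a small fraction `tau` toward the online ones.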

TODO

Reference