Modified from the work of Patrick Emami: Deep Deterministic Policy Gradients in TensorFlow
Algorithm and hyperparameter details can be found here: "Continuous control with deep reinforcement learning" - TP Lillicrap, JJ Hunt et al., 2015
Built with Gym and TensorFlow. Changes from the original:
- Removed TFLearn dependency
- Added Ornstein-Uhlenbeck noise function
- Added reward discounting
- Works with discrete and continuous action spaces
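
Two of the changes listed above, the Ornstein-Uhlenbeck noise and the reward discounting, can be illustrated with a minimal NumPy sketch. This is not the code from this file: the names `OrnsteinUhlenbeckNoise` and `discount_rewards` are hypothetical, the time step `dt` is an assumed value, and `theta=0.15`, `sigma=0.2` follow the values reported in the Lillicrap et al. paper. Reading "reward discounting" as per-episode discounted returns is also an assumption; the discount could equally be applied only inside the critic's TD target.

```python
import numpy as np


class OrnsteinUhlenbeckNoise:
    """Temporally correlated exploration noise (hypothetical helper).

    Discretizes dx = theta * (mu - x) * dt + sigma * dW with one
    Euler-Maruyama step per call.
    """

    def __init__(self, mu, sigma=0.2, theta=0.15, dt=1e-2, seed=None):
        self.mu = np.asarray(mu, dtype=np.float64)
        self.sigma = sigma
        self.theta = theta
        self.dt = dt  # assumed step size, not taken from the source
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        # Restart the process at zero, e.g. at the beginning of an episode.
        self.x = np.zeros_like(self.mu)

    def __call__(self):
        # Drift pulls the state toward mu; diffusion adds Gaussian noise.
        drift = self.theta * (self.mu - self.x) * self.dt
        diffusion = (self.sigma * np.sqrt(self.dt)
                     * self.rng.standard_normal(self.mu.shape))
        self.x = self.x + drift + diffusion
        return self.x


def discount_rewards(rewards, gamma=0.99):
    """Return the discounted return G_t = r_t + gamma * G_{t+1} for each step."""
    returns = np.zeros(len(rewards), dtype=np.float64)
    running = 0.0
    # Accumulate from the last step backwards.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns
```

In use, the noise sample would be added to the actor's output before the action is passed to the environment, for example `action = actor_output + noise()`, where `actor_output` stands for whatever the policy network returns.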