ddpg-aigym

Deep Deterministic Policy Gradient

Implementation of the Deep Deterministic Policy Gradient (DDPG) algorithm (Lillicrap et al., arXiv:1509.02971) in TensorFlow.
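
For readers new to DDPG, here is a minimal NumPy sketch of two pieces the algorithm relies on: the critic's bootstrapped target and the soft update of the target networks. It is an illustration only, not code from this repository. The function names `td_target` and `soft_update` are hypothetical, and the defaults for `gamma` and `tau` are the values reported in the paper, which may differ from what this repository uses.

```python
import numpy as np


def td_target(reward, next_q, done, gamma=0.99):
    """Critic target y = r + gamma * Q'(s', mu'(s')).

    next_q is the target critic's value for the target actor's action in
    the next state. Terminal transitions (done == 1) are not bootstrapped.
    """
    reward = np.asarray(reward, dtype=np.float64)
    next_q = np.asarray(next_q, dtype=np.float64)
    done = np.asarray(done, dtype=np.float64)
    return reward + gamma * (1.0 - done) * next_q


def soft_update(target_params, online_params, tau=0.001):
    """Move each target parameter a small step toward its online counterpart:
    theta_target <- tau * theta + (1 - tau) * theta_target.
    """
    return [tau * w + (1.0 - tau) * w_t
            for w_t, w in zip(target_params, online_params)]


# Two transitions; the second one ends the episode.
y = td_target(reward=[1.0, 0.5], next_q=[2.0, 3.0], done=[0, 1], gamma=0.9)

# One soft-update step from all-zero target weights toward all-one weights.
new_target = soft_update([np.zeros(2)], [np.ones(2)], tau=0.1)
```

With these inputs, `y` is `[2.8, 0.5]` and the updated target weights are `[0.1, 0.1]`. A small `tau` keeps the target networks changing slowly, which is what stabilizes the critic's regression target.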

git clone https://github.com/stevenpjg/ddpg-aigym.git
cd ddpg-aigym
python main.py

Learning curve for the InvertedPendulum-v1 environment (see learning_curve.png).

Dependencies:

TensorFlow (developed with version 0.11.0rc0; CPU and GPU versions)
OpenAI Gym
MuJoCo

To use a different environment:

experiment = 'InvertedPendulum-v1'  # specify the environment here

To use batch normalization:

is_batch_norm = True  # batch normalization switch
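
As background on what this switch enables, the following is a minimal NumPy sketch of the training-time batch normalization transform. It is an assumption-laden illustration, not this repository's TensorFlow implementation: the function name `batch_norm` and the `eps` value are hypothetical, and at inference time a real layer would use running averages of the mean and variance rather than per-batch statistics.

```python
import numpy as np


def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch axis, then scale and shift.

    x has shape (batch, features); gamma and beta have shape (features,).
    """
    mean = x.mean(axis=0)                     # per-feature batch mean
    var = x.var(axis=0)                       # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)   # roughly zero mean, unit variance
    return gamma * x_hat + beta               # learnable scale and shift


# Two features on very different scales end up comparable after normalization.
x = np.array([[1.0, 10.0],
              [3.0, 30.0],
              [5.0, 50.0]])
out = batch_norm(x, gamma=np.ones(2), beta=np.zeros(2))
```

After the call, each column of `out` has approximately zero mean and unit standard deviation, regardless of the scale of the input features.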

Let me know if you run into any issues or need clarification regarding hyperparameter tuning.
