DDPG

Continuous control with deep reinforcement learning

These algorithms perform policy iteration by alternating between two steps: evaluating the current policy with Q-learning, and then improving that policy by following the policy gradient.
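As a rough illustration of this alternation, the sketch below runs one update on a batch of transitions. It is not the code in this repository: it assumes linear stand-ins for the actor and critic networks, a scalar action, and NumPy. The names `ddpg_step`, `actor`, `critic`, the parameter dict `p`, and the constants `GAMMA`, `TAU`, `LR_CRITIC`, `LR_ACTOR` are all hypothetical.

```python
import numpy as np

GAMMA, TAU = 0.99, 0.01            # discount and soft-update rate (assumed values)
LR_CRITIC, LR_ACTOR = 1e-2, 1e-3   # learning rates (assumed values)

def actor(theta, s):
    """Deterministic policy mu(s) = s . theta, one scalar action per state."""
    return s @ theta

def critic(w, s, a):
    """Action-value estimate Q(s, a) = [s, a] . w (linear stand-in for a network)."""
    return np.column_stack([s, a]) @ w

def ddpg_step(p, batch):
    """Run one update on the parameter dict `p`; return the critic's squared TD error."""
    s, a, r, s2, done = batch
    # Policy evaluation: move Q(s, a) toward the one-step TD target,
    # which is computed with the slowly moving target parameters.
    y = r + GAMMA * (1.0 - done) * critic(p["w_targ"], s2, actor(p["theta_targ"], s2))
    td = critic(p["w"], s, a) - y
    p["w"] -= LR_CRITIC * np.column_stack([s, a]).T @ td / len(r)
    # Policy improvement: ascend dQ/da * dmu/dtheta, averaged over the batch.
    dq_da = p["w"][-1]                                # constant for a linear critic
    p["theta"] += LR_ACTOR * dq_da * s.mean(axis=0)   # dmu/dtheta = s
    # Soft update of the target parameters.
    for k in ("w", "theta"):
        p[k + "_targ"] = TAU * p[k] + (1.0 - TAU) * p[k + "_targ"]
    return float(np.mean(td ** 2))

# Usage on a random batch of 32 transitions with 3-dimensional states.
rng = np.random.default_rng(0)
n, state_dim = 32, 3
p = {"w": rng.normal(size=state_dim + 1), "theta": rng.normal(size=state_dim)}
p["w_targ"], p["theta_targ"] = p["w"].copy(), p["theta"].copy()
batch = (rng.normal(size=(n, state_dim)), rng.normal(size=n), rng.normal(size=n),
         rng.normal(size=(n, state_dim)), np.zeros(n))
loss = ddpg_step(p, batch)
```

With neural networks in place of the linear models, the two gradient steps would normally be handled by an automatic-differentiation framework, but the order of operations stays the same.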

TODO

Batch Normalization
Prioritized Experience Replay (https://arxiv.org/abs/1511.05952): replay important transitions from the replay memory more frequently
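For the second TODO item, a minimal sketch of proportional prioritization might look like the following. It is only an illustration, not the contents of `replay.py`: the class name `PrioritizedReplay` and its methods are hypothetical, sampling is a plain O(N) weighted draw with replacement rather than a sum-tree, and the importance weights are normalized by the largest weight in the sampled batch.

```python
import random

class PrioritizedReplay:
    """Hypothetical proportional prioritized buffer (no sum-tree, O(N) sampling)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity, self.alpha, self.eps = capacity, alpha, eps
        self.data, self.priorities = [], []
        self.pos = 0  # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions get the current maximum priority so they are replayed soon.
        priority = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(priority)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = priority
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # P(i) is proportional to priority_i ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        n = len(self.data)
        idx = random.choices(range(n), weights=scaled, k=batch_size)
        # Importance-sampling weights (N * P(i)) ** -beta correct the sampling bias.
        weights = [(n * scaled[i] / total) ** -beta for i in idx]
        max_w = max(weights)
        return idx, [self.data[i] for i in idx], [w / max_w for w in weights]

    def update(self, idx, td_errors):
        # Priority is the absolute TD error plus a small constant.
        for i, delta in zip(idx, td_errors):
            self.priorities[i] = abs(delta) + self.eps
```

In a training loop, `sample` would feed the critic update, the returned weights would scale each transition's loss, and `update` would then be called with the fresh TD errors.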


yusme/DDPG
