Companion code for https://confengine.com/odsc-india-2019/proposal/10044/algorithms-that-learn-to-solve-tasks-by-watching-one-youtube-video
It shows how to apply a reinforcement learning algorithm called Policy Gradients (many of the recent successes in RL use it in some form) to a toy problem: landing a small spaceship on the moon. It is built using PyTorch.
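To give a feel for the update rule before opening the PyTorch code, here is a minimal sketch of the simplest policy gradient method (REINFORCE) in plain Python. It is not the code from this repository: it assumes a hypothetical two-armed bandit instead of the lander, and the names `softmax` and `train`, the reward values, and the hyperparameters are all made up for illustration.

```python
import math
import random


def softmax(logits):
    """Turn a list of logits into action probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=2000, lr=0.1, seed=0):
    """Run REINFORCE on a hypothetical two-armed bandit.

    Assumption for illustration only: arm 1 always pays 1.0 and
    arm 0 always pays 0.0. Returns the final action probabilities.
    """
    rng = random.Random(seed)
    rewards = [0.0, 1.0]  # assumed toy rewards, not the lander's
    logits = [0.0, 0.0]   # policy parameters, one logit per action
    for _ in range(steps):
        probs = softmax(logits)
        # Sample an action from the current stochastic policy.
        action = rng.choices([0, 1], weights=probs)[0]
        reward = rewards[action]
        # For a softmax policy, d log pi(action) / d logit_k
        # equals (1 if k == action else 0) - probs[k].
        for k in range(len(logits)):
            grad_log_prob = (1.0 if k == action else 0.0) - probs[k]
            # Gradient ascent on expected reward.
            logits[k] += lr * reward * grad_log_prob
    return softmax(logits)


if __name__ == "__main__":
    print(train())
```

The policy starts out choosing both arms equally often and shifts probability toward the arm that pays. The lander problem follows the same idea, with a neural network producing the logits from the observed state and the reward replaced by the return collected over an episode.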
One of the most frequently asked questions was: how do you learn to train deep reinforcement learning algorithms if you are starting from scratch? Here is a list of (zero to one) resources:
- Reinforcement Learning Lectures by David Silver (https://www.youtube.com/playlist?list=PL7-jPKtc4r78-wCZcQn5IqyuWhBZ8fOxT)
- The Nature paper on Deep Q-Learning (https://web.stanford.edu/class/psych209/Readings/MnihEtAlHassibis15NatureControlDeepRL.pdf)
- The blog post on Policy Gradients by Andrej Karpathy (http://karpathy.github.io/2016/05/31/rl/)
- The paper on Deep Deterministic Policy Gradients (https://arxiv.org/abs/1509.02971)