TensorFlow implementations of different reinforcement learning algorithms.
List of implemented algorithms:
The package can be installed by typing:
git clone https://github.com/asprenger/rl-experiments.git
cd rl-experiments
pip install -e .
This is an example of an agent playing Atari Space Invaders, trained with PPO. The agent is quite efficient at killing the aliens and also does a good job of evading the bombs. Nevertheless, it still fails consistently in the fifth round because it has not yet learned to kill the aliens in the lowermost row first, so the game terminates when the aliens touch the ground. The agent probably needs more training time.
Here is an example of the Roboschool Hopper trained with PPO. Interestingly, it has learned to create additional forward momentum by thrusting its upper limb forward at the right time.
Here is an example of the Roboschool Walker2d, also trained with PPO. The movement is pretty clumsy, and the walker falls over from time to time.