laroccacharly/reinforcement_learning_adventure

A reinforcement learning adventure

This is a compilation of reinforcement learning algorithms.

**There is no shortage of RL repos on GitHub (see the Related works section). If you are looking for state-of-the-art implementations, I recommend looking into those instead.**

Goals of this project

Learn about RL by implementing the algorithms myself while following good practices (OOP, TDD).

I noticed how most repos lack a test suite and good abstractions.

Related works

List of algorithms

  • Action-value method (to solve the k-armed Bandit problem)
  • Q-Learning (using a linear model and a deep neural network)
  • Monte Carlo Control
  • Actor-Critic with eligibility traces
  • DDPG
  • PPO (WIP)
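
To illustrate the first entry in the list, here is a minimal sketch of a sample-average action-value method with epsilon-greedy exploration on a k-armed bandit. It is not this repo's code: the function name `run_bandit`, the Gaussian reward model, and the default parameters are all assumptions made for the example.

```python
import random


def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Estimate arm values with sample averages and epsilon-greedy selection.

    `true_means` is a hypothetical list holding the mean reward of each arm.
    """
    rng = random.Random(seed)
    k = len(true_means)
    q = [0.0] * k       # current value estimate for each arm
    counts = [0] * k    # number of times each arm was pulled

    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(k)                       # explore
        else:
            action = max(range(k), key=lambda a: q[a])      # exploit
        # Assumed reward model: unit-variance Gaussian around the arm's mean.
        reward = rng.gauss(true_means[action], 1.0)
        counts[action] += 1
        # Incremental form of the sample average: Q <- Q + (R - Q) / N
        q[action] += (reward - q[action]) / counts[action]

    return q, counts


if __name__ == "__main__":
    estimates, pulls = run_bandit([0.0, 0.5, 2.0])
    print("estimates:", estimates)
    print("pulls per arm:", pulls)
```

The incremental update avoids storing past rewards, which is why only `q` and `counts` are kept per arm.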

Hyperparameter tuning

This repo also includes an abstraction (HyperFitter) that runs a grid or random search over the agent's hyperparameter space. I was inspired by HyperOpt and sklearn's GridSearchCV, but I had to make my own because they were not flexible enough for my needs.
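
The difference between the two search strategies can be sketched in a few lines. This is not the HyperFitter API: the helper names (`grid_candidates`, `random_candidates`, `best_params`), the toy scoring function, and the hyperparameter names are all hypothetical and only illustrate the idea.

```python
import itertools
import random


def grid_candidates(space):
    """Yield every combination of the listed hyperparameter values."""
    names = list(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def random_candidates(space, n_samples, seed=0):
    """Yield `n_samples` combinations drawn uniformly from the listed values."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        yield {name: rng.choice(values) for name, values in space.items()}


def best_params(candidates, evaluate):
    """Return the highest-scoring candidate together with its score."""
    best, best_score = None, float("-inf")
    for params in candidates:
        score = evaluate(params)
        if score > best_score:
            best, best_score = params, score
    return best, best_score


def toy_score(params):
    """Stand-in for 'train an agent and return its average return'."""
    return -(params["learning_rate"] - 0.1) ** 2 - (params["gamma"] - 0.99) ** 2


space = {"learning_rate": [0.01, 0.1, 1.0], "gamma": [0.9, 0.99]}

if __name__ == "__main__":
    print(best_params(grid_candidates(space), toy_score))
    print(best_params(random_candidates(space, n_samples=4), toy_score))
```

Grid search evaluates all six combinations here, while random search only evaluates as many samples as requested, which matters once training an agent per candidate becomes expensive.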

Dependencies

  • PyTorch (for neural networks)
  • TensorFlow
  • OpenAI Gym (for the environments)
  • pytest (for testing)
  • sklearn (for feature mapping)
  • Matplotlib (for plotting)
  • NumPy (for everything else)

Testing

Testing is an interesting issue in RL. It is not clear what assertions should be made about an agent. We know that it should learn, but it is hard to predict what the learning curve should look like. See this great talk by Dr. Pineau: https://www.youtube.com/watch?v=-0G98MYUtjI

This repo mainly uses integration testing (testing the general behavior instead of every method of every class). To do so, the tests check that the return of the agent (the metric we try to maximize) is greater than the return of an agent that takes random decisions.

PYTHONPATH=. pytest 

You can add the --pdb flag to drop into the debugger when a test fails.
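
The integration-testing idea described above (a trained agent must beat a random one) can be sketched as a single pytest-style test. This is an illustration under stated assumptions, not a test from this repo: the toy `CorridorEnv`, the `RandomAgent` and `QLearningAgent` classes, and the helper functions are all hypothetical, and a tiny hand-written environment stands in for a Gym environment so the example stays self-contained.

```python
import random


class CorridorEnv:
    """Toy environment (hypothetical): walk from cell 0 to the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 0 moves left, action 1 moves right; walls clamp the position.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        return self.state, -1.0, done  # every step costs 1


class RandomAgent:
    """Baseline that ignores the state and never learns."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def act(self, state, explore=True):
        return self.rng.randrange(2)

    def learn(self, state, action, reward, next_state, done):
        pass


class QLearningAgent:
    """Tabular Q-learning with epsilon-greedy exploration."""

    def __init__(self, n_states, alpha=0.5, gamma=0.95, epsilon=0.2, seed=0):
        self.q = [[0.0, 0.0] for _ in range(n_states)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def act(self, state, explore=True):
        if explore and self.rng.random() < self.epsilon:
            return self.rng.randrange(2)
        return max(range(2), key=lambda a: self.q[state][a])

    def learn(self, state, action, reward, next_state, done):
        target = reward if done else reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


def run_episode(env, agent, explore, max_steps=50):
    """Play one episode and return the undiscounted sum of rewards."""
    state, total = env.reset(), 0.0
    for _ in range(max_steps):
        action = agent.act(state, explore)
        next_state, reward, done = env.step(action)
        if explore:  # the agent only learns during training episodes
            agent.learn(state, action, reward, next_state, done)
        state, total = next_state, total + reward
        if done:
            break
    return total


def average_return(env, agent, episodes, explore=False):
    return sum(run_episode(env, agent, explore) for _ in range(episodes)) / episodes


def test_agent_beats_random():
    env = CorridorEnv()
    agent = QLearningAgent(env.length)
    average_return(env, agent, episodes=300, explore=True)  # training phase
    assert average_return(env, agent, 20) > average_return(env, RandomAgent(), 20)
```

The test asserts only a relative property (better than random) instead of a specific learning curve, which keeps it stable across seeds and hyperparameter changes.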