Skip to content

FanchenBao/reinforcement_learning

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Description

This repo contains three projects from the class CAP6629 Rinforcement Learning, offered by Florida Atlantic University, and taught by Dr. Zhen Ni. Each project covers a crucial part of my journey in reinforcement learning. The goal of this repo is to document what I have learned and hopefully also help others if they get stuck early on in their journey through reinforcement learning.

Repo Layout

Below I will briefly describe the contents of the repo and each project folder. Each project folder has similar structure: a notebook containing all the code for the project, a figures folder containing plots made from the notebook, and a report summarizing the objective, finding, and discussion of the project.

The code in the notebook is heavily documented. So hopefully no additional instruction is needed to get one started. Users are also encouraged to skim through the report in each project folder to get more context of the code in the notebook.

Project Overview

Project 1: Multi-arm bandit

It contains analysis of how different $\epsilon$ values in the $\epsilon$-greedy method changes the expected return.

Project 2: Tabular Method

This is a very ambitious endeavor to code all tabular methods covered in the lecture and analyze their performance on solving a simple grid world problem. Users are strongly encouraged to cross reference with the associated report to learn the algorithm of each tabular method.

Project 3: Actor-Critic Architecture

The Actor-Critic architecture is implemented to solve a grid world problem, the same one as in Project 2. However, unlike the tabular method, it is based on training neural network models to simulate the policy and state value function. Implementation borrows from the documentation on Keras and TensorFlow, but with my own twist.

There is also a presentation on the Actor-Critic Architecture for a quick overview of the project result.

Playground

Try the notebook on Binder. Note that it takes quite a while for Binder to set up the environment for the first time. Please be patient. Subsequent runs are much faster.

  • Project 1: Binder
  • Project 2: Binder
  • Project 3: Binder

Releases

No releases published

Packages

No packages published