Description

This repo contains three projects from CAP6629 Reinforcement Learning, a course offered by Florida Atlantic University and taught by Dr. Zhen Ni. Each project covers a crucial part of my journey in reinforcement learning. The goal of this repo is to document what I have learned and, hopefully, to help others who get stuck early in their own journey through reinforcement learning.

Repo Layout

Below I briefly describe the contents of the repo and each project folder. Each project folder has a similar structure: a notebook containing all the code for the project, a figures folder containing plots generated by the notebook, and a report summarizing the objective, findings, and discussion of the project.

The code in each notebook is heavily documented, so hopefully no additional instruction is needed to get started. Users are also encouraged to skim the report in each project folder for more context on the code in the notebook.

Project Overview

Project 1: Multi-arm bandit

It contains an analysis of how different $\epsilon$ values in the $\epsilon$-greedy method change the expected return.
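
As a rough illustration of the kind of experiment this project runs, here is a minimal $\epsilon$-greedy sketch. It is not the notebook's code: the 10-arm Gaussian testbed, the function name `run_bandit`, and all parameter defaults are assumptions made for this example.

```python
import random


def run_bandit(epsilon, n_arms=10, n_steps=1000, seed=0):
    """Play one epsilon-greedy run and return the average reward per step."""
    rng = random.Random(seed)
    # Assumed testbed: each arm's true value is drawn from a standard normal.
    true_values = [rng.gauss(0.0, 1.0) for _ in range(n_arms)]
    estimates = [0.0] * n_arms  # running estimate of each arm's value
    counts = [0] * n_arms       # number of pulls per arm
    total_reward = 0.0
    for _ in range(n_steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            # exploit: pick the arm with the highest current estimate
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(true_values[arm], 1.0)  # noisy reward
        counts[arm] += 1
        # Incremental sample-average update of the chosen arm's estimate.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return total_reward / n_steps


if __name__ == "__main__":
    # Average over several seeds to compare epsilon values.
    for eps in (0.0, 0.01, 0.1):
        avg = sum(run_bandit(eps, seed=s) for s in range(50)) / 50
        print(f"epsilon={eps}: average reward {avg:.3f}")
```

Averaging over many seeds matters here because a single run depends heavily on which arm values happen to be drawn.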

Project 2: Tabular Method

This is an ambitious endeavor to code all the tabular methods covered in the lectures and analyze their performance on a simple grid world problem. Users are strongly encouraged to cross-reference the associated report to learn the algorithm behind each tabular method.
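
To give a flavor of what a tabular method looks like, below is a minimal Q-learning sketch. Q-learning is chosen here only as a well-known representative; the 4x4 grid, the reward of -1 per step, the start and goal cells, and all names and hyperparameters are assumptions for illustration and may differ from the grid world and code in the notebook.

```python
import random

SIZE = 4                                      # assumed 4x4 grid
GOAL = (SIZE - 1, SIZE - 1)                   # assumed goal: bottom-right cell
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Move within the grid (walls clamp the move); every step costs -1."""
    row = min(max(state[0] + action[0], 0), SIZE - 1)
    col = min(max(state[1] + action[1], 0), SIZE - 1)
    next_state = (row, col)
    return next_state, -1.0, next_state == GOAL


def q_learning(episodes=500, alpha=0.5, gamma=1.0, epsilon=0.1, seed=0):
    """Learn a table of action values with epsilon-greedy Q-learning."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(SIZE) for c in range(SIZE)}
    for _ in range(episodes):
        state, done = (0, 0), False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: reward plus best value of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_path_length(q, max_steps=100):
    """Follow the greedy policy from the start and count steps to the goal."""
    state, steps, done = (0, 0), 0, False
    while not done and steps < max_steps:
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _, done = step(state, ACTIONS[a])
        steps += 1
    return steps
```

In this assumed grid the shortest route from the start to the goal takes 6 moves, so `greedy_path_length(q_learning())` is a quick sanity check on whether the learned table encodes a shortest path.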

Project 3: Actor-Critic Architecture

The Actor-Critic architecture is implemented to solve the same grid world problem as in Project 2. However, unlike the tabular methods, it trains neural network models to approximate the policy and the state value function. The implementation borrows from the Keras and TensorFlow documentation, but with my own twist.
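
The sketch below shows the core one-step Actor-Critic update in isolation. To stay dependency-free it deliberately swaps the neural networks for lookup tables (a softmax over per-state action preferences as the actor, a per-state value table as the critic), so it is not the notebook's Keras/TensorFlow implementation. The grid, rewards, names, and hyperparameters are all assumptions for this example.

```python
import math
import random

SIZE = 4                                      # assumed 4x4 grid
GOAL = (SIZE - 1, SIZE - 1)                   # assumed goal: bottom-right cell
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Move within the grid (walls clamp the move); every step costs -1."""
    row = min(max(state[0] + action[0], 0), SIZE - 1)
    col = min(max(state[1] + action[1], 0), SIZE - 1)
    next_state = (row, col)
    return next_state, -1.0, next_state == GOAL


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic(episodes=300, alpha_actor=0.1, alpha_critic=0.2,
                 gamma=0.95, max_steps=200, seed=0):
    """One-step Actor-Critic with tabular actor and critic."""
    rng = random.Random(seed)
    states = [(r, c) for r in range(SIZE) for c in range(SIZE)]
    prefs = {s: [0.0] * len(ACTIONS) for s in states}  # actor parameters
    values = {s: 0.0 for s in states}                  # critic estimates
    for _episode in range(episodes):
        state = (0, 0)
        for _t in range(max_steps):
            probs = softmax(prefs[state])
            a = rng.choices(range(len(ACTIONS)), weights=probs)[0]
            next_state, reward, done = step(state, ACTIONS[a])
            # TD error: how much better or worse the step was than expected.
            target = reward + (0.0 if done else gamma * values[next_state])
            td_error = target - values[state]
            # Critic update: move the state value toward the TD target.
            values[state] += alpha_critic * td_error
            # Actor update: for a softmax policy, the gradient of
            # log pi(a|s) w.r.t. preference b is 1[a == b] - pi(b|s).
            for b in range(len(ACTIONS)):
                grad = (1.0 if b == a else 0.0) - probs[b]
                prefs[state][b] += alpha_actor * td_error * grad
            state = next_state
            if done:
                break
    return prefs, values
```

With neural networks, the same TD error would instead scale the gradients of the actor and critic losses; the table version only makes that signal easier to inspect.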

There is also a presentation on the Actor-Critic architecture for a quick overview of the project results.

Playground

Try the notebooks on Binder. Note that Binder takes quite a while to set up the environment the first time, so please be patient. Subsequent runs are much faster.

Project 1:
Project 2:
Project 3:
