A Crash Course on Reinforcement Learning for Control Problems Using TensorFlow 2

This is a self-contained repository that explains two basic Reinforcement Learning (RL) algorithms, namely Policy Gradient (PG) and Q-learning, and shows how to apply them to control problems. Dynamical systems may have a discrete action space, like cartpole, where the two possible actions are +1 and -1, or a continuous action space, like linear Gaussian systems. Usually, you can find code for only one of these cases, and it may not be obvious how to extend it to the other.

In this repository, we explain how to formulate PG and Q-learning for each of these cases, and we provide implementations of both algorithms for both cases as Jupyter notebooks. You can also find the pure code for these algorithms (and for a few more algorithms that I have implemented but not discussed). The code is easy to read and follow. It is written in a modular way, so that, for example, a reader interested in the implementation of an algorithm is not distracted by defining an environment in gym or plotting the results. The theoretical material in this repo is summarized in a handout, which is available on arXiv at https://arxiv.org/abs/2103.04910.
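To make the difference between the two cases concrete, here is a minimal sketch of how an action can be sampled, together with its log-probability, for a discrete and for a continuous action space. It is not the code from the notebooks: it uses only the Python standard library, and the function names (`softmax`, `sample_discrete`, `gaussian_log_prob`, `sample_continuous`) are hypothetical names chosen for this illustration. The discrete case assumes a softmax (categorical) policy and the continuous case assumes a scalar Gaussian policy.

```python
import math
import random


def softmax(logits):
    """Turn a list of logits into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_discrete(logits, rng):
    """Discrete action space: sample an action index from a categorical policy.

    Returns the action index and its log-probability.
    """
    probs = softmax(logits)
    action = rng.choices(range(len(probs)), weights=probs)[0]
    return action, math.log(probs[action])


def gaussian_log_prob(action, mean, std):
    """Log-density of the scalar Gaussian N(mean, std^2) evaluated at `action`."""
    return (-0.5 * ((action - mean) / std) ** 2
            - math.log(std)
            - 0.5 * math.log(2.0 * math.pi))


def sample_continuous(mean, std, rng):
    """Continuous action space: sample a real-valued action from a Gaussian policy.

    Returns the action and its log-probability (log-density).
    """
    action = rng.gauss(mean, std)
    return action, gaussian_log_prob(action, mean, std)
```

In both cases the learning algorithm only needs the sampled action and its log-probability, which is one way to see why the same PG formulation can be carried over from one case to the other.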

Citing this repo

Here is a BibTeX entry that you can use to cite the handout in a publication:

@misc{yaghmaie2021crash,
      title={A Crash Course on Reinforcement Learning}, 
      author={Farnaz Adib Yaghmaie and Lennart Ljung},
      year={2021},
      eprint={2103.04910},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

If you use this repo, please consider citing the following relevant papers:

How to use this repo

This repository contains presentation files and code.

The presentation files are related to the LINK-SIC workshop on Reinforcement Learning. The first day will be Friday March 12, 2021, 13.15 - 16.30, and the second day will be Tuesday April 6, 2021, 13.15 - 16.30. You can find the presentation files as PDFs in the folder presentation.

The code is given as Jupyter notebooks and Python files. If you want to run the Jupyter notebooks, I suggest using Google Colab. If you want to extend the results and examine more systems, I suggest cloning this repository and running the code on your computer.

Running on Google Colab

  • Go to [https://colab.research.google.com/notebooks/intro.ipynb] and sign in with a Google account.
  • Click File, and then Upload notebook. If you get the webpage in Swedish, click Arkiv and then Ladda upp anteckningsbok.
  • Select GitHub and paste the following link: [https://github.com/FarnazAdib/Crash_course_on_RL.git].
  • A list of files of type .ipynb then appears. These are Jupyter notebooks. A Jupyter notebook can contain both text and code, and the code can be run. As an example, scroll down and open pg_on_cartpole_notebook.ipynb.
  • The file contains some cells with text and some cells with code. The cells that contain code have [ ] on the left. If you move your mouse over [ ], a play button appears. Click on it to run the cell. Make sure not to skip a cell, since doing so causes fatal errors.
  • Continue like this and run all code cells one by one up to the end.

Running on local computer

Where to start

The theoretical material in this repo is summarized in our handout, available in PDF format at https://arxiv.org/abs/2103.04910. If you wish to read the material in this repo, you can start by reading about Reinforcement Learning.

Dynamical systems

You can read about the dynamical systems (or environments, in RL terminology) that we consider in this repo here.
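As an illustration of what such an environment can look like, here is a minimal sketch of a scalar linear Gaussian system with a gym-like `reset`/`step` interface. It is not the environment used in this repo: the class name `ScalarLinearGaussianSystem`, the parameter values, and the quadratic stage cost are all assumptions made for this example.

```python
import random


class ScalarLinearGaussianSystem:
    """Toy system x_next = a * x + b * u + w, with noise w ~ N(0, noise_std^2)."""

    def __init__(self, a=0.9, b=0.5, noise_std=0.1, seed=0):
        self.a = a
        self.b = b
        self.noise_std = noise_std
        self.rng = random.Random(seed)
        self.x = 0.0

    def reset(self, x0=1.0):
        """Set the initial state and return it."""
        self.x = x0
        return self.x

    def step(self, u):
        """Apply the real-valued action u; return (next_state, reward)."""
        # Assumed quadratic stage cost; the reward is its negative.
        cost = self.x ** 2 + u ** 2
        w = self.rng.gauss(0.0, self.noise_std)
        self.x = self.a * self.x + self.b * u + w
        return self.x, -cost
```

Here the action `u` is any real number, in contrast to cartpole, where the action is one of two values.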

Policy Gradient

Policy Gradient is one of the popular RL routines; it optimizes the policy directly. Below, you can see the Jupyter notebooks for the Policy Gradient (PG) algorithm.

You can also see the pure code for PG.
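To give a feel for what "optimizing the policy directly" means, here is a minimal sketch of the basic PG (REINFORCE-style) update on a toy problem. It is not the TensorFlow implementation from the notebooks: it uses only the Python standard library, the problem is a hypothetical two-action task with one-step episodes (action 0 gives reward 0, action 1 gives reward 1), and the names `reinforce_bandit`, `lr`, and `num_steps` are assumptions for this example. The policy is a softmax over two logits, and each update moves the logits along the gradient of the log-probability of the sampled action, scaled by the reward received.

```python
import math
import random


def softmax(logits):
    """Turn a list of logits into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(num_steps=2000, lr=0.1, seed=0):
    """Run the PG update on a toy two-action problem; return the final action probabilities."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]       # policy parameters: one logit per action
    rewards = [0.0, 1.0]     # assumed toy rewards: action 1 is the better one
    for _ in range(num_steps):
        probs = softmax(theta)
        action = rng.choices([0, 1], weights=probs)[0]
        ret = rewards[action]  # return of this one-step episode
        for i in range(2):
            # Gradient of log softmax w.r.t. logit i: 1{i == action} - probs[i].
            grad_log_pi = (1.0 if i == action else 0.0) - probs[i]
            # Gradient ascent on the expected return.
            theta[i] += lr * ret * grad_log_pi
    return softmax(theta)
```

Calling `reinforce_bandit()` returns the learned probabilities of the two actions; the probability of the rewarded action grows as training proceeds. In the control problems of this repo the episodes have many steps and the policy is a neural network, but the structure of the update is the same.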

Q-learning

Q-learning is another popular RL routine; it relies on dynamic programming. Below, you can see the Jupyter notebooks for the Q-learning algorithm.

You can also see the pure code for Q-learning and experience replay Q-learning.
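The dynamic programming idea behind Q-learning can be seen in a minimal tabular sketch. It is not the implementation from the notebooks: it uses only the Python standard library, the environment is a hypothetical five-state chain (action 0 moves left, action 1 moves right, and reaching the last state gives reward 1 and ends the episode), and the name `q_learning_chain` and all parameter values are assumptions for this example. Actions are chosen uniformly at random during learning, which is allowed because Q-learning is off-policy.

```python
import random


def q_learning_chain(num_episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a toy 5-state chain; returns the table q[state][action]."""
    rng = random.Random(seed)
    n_states, goal = 5, 4
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _episode in range(num_episodes):
        s = 0
        for _step in range(200):  # cap on the episode length
            a = rng.randrange(2)  # uniformly random behaviour policy
            s_next = max(s - 1, 0) if a == 0 else s + 1
            done = s_next == goal
            r = 1.0 if done else 0.0
            # Bellman target: reward plus discounted value of the best next action.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if done:
                break
    return q
```

After training, the greedy policy with respect to `q` picks the action with the larger Q-value in each state, which here means moving right towards the goal.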

Presentation files

The presentation files for the LINK-SIC workshop can be downloaded from the folder called presentation. There, you can find the presentation files for day1 and day2.
