Reinforcement Learning Library


 ________  _______   ___  ___       ___           ___    ___ 
|\   __  \|\  ___ \ |\  \|\  \     |\  \         |\  \  /  /|
\ \  \|\  \ \   __/|\ \  \ \  \    \ \  \        \ \  \/  / /
 \ \   _  _\ \  \_|/_\ \  \ \  \    \ \  \        \ \    / / 
  \ \  \\  \\ \  \_|\ \ \  \ \  \____\ \  \____    \/  /  /  
   \ \__\\ _\\ \_______\ \__\ \_______\ \_______\__/  / /    
    \|__|\|__|\|_______|\|__|\|_______|\|_______|\___/ /     
                                                \|___|/

How to Install

Clone the repository including submodules:

git clone --recurse-submodules -j8 https://github.com/CavenaghiEmanuele/REILLY.git

Build the package with the C++ backend and install it:

cd REILLY && sudo python3 setup.py install

Legend

(empty cell) - Not implemented
✔️ - Already implemented
❌ - Non-existent (the variant does not exist)

Tabular Agents

MonteCarlo

Name	On-Policy	Off-Policy	Python	C/C++
MonteCarlo (First Visit)	✔️		✔️	✔️
MonteCarlo (Every Visit)	✔️		✔️	✔️
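
The difference between the two variants above can be illustrated with a minimal, standalone sketch. It does not use REILLY's API: the function name `mc_state_values` and the `(state, reward)` episode format are assumptions made for this example only.

```python
from collections import defaultdict


def mc_state_values(episodes, gamma=1.0, first_visit=True):
    """Estimate state values by averaging sampled returns.

    episodes: list of episodes, each a list of (state, reward) pairs,
    where reward is the reward received after leaving that state.
    (Hypothetical format chosen for this sketch.)
    """
    returns = defaultdict(list)
    for episode in episodes:
        # Remember where each state first appears in this episode.
        first_index = {}
        for t, (state, _) in enumerate(episode):
            first_index.setdefault(state, t)

        g = 0.0
        # Walk backwards so g holds the discounted return from step t.
        for t in range(len(episode) - 1, -1, -1):
            state, reward = episode[t]
            g = reward + gamma * g
            # First Visit keeps only the return of the first occurrence;
            # Every Visit keeps the return of every occurrence.
            if not first_visit or first_index[state] == t:
                returns[state].append(g)

    return {s: sum(gs) / len(gs) for s, gs in returns.items()}
```

For an episode that visits the same state twice, First Visit averages one return for that state while Every Visit averages two, so the estimates generally differ.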

Temporal Difference

Name	On-Policy	Off-Policy	Python	C/C++
Sarsa	✔️		✔️	✔️
Q-learning	❌	✔️	✔️	✔️
Expected Sarsa	✔️		✔️	✔️
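
As a rough illustration of how the three agents above differ, the sketch below computes their one-step update targets on a Q-table stored as nested dictionaries. It is independent of REILLY's classes: all function names and the `q[state][action]` layout are assumptions for this example, and Expected Sarsa is shown for an ε-greedy policy.

```python
def sarsa_target(q, reward, next_state, next_action, gamma):
    # On-policy: bootstrap from the action actually chosen next.
    return reward + gamma * q[next_state][next_action]


def q_learning_target(q, reward, next_state, gamma):
    # Off-policy: bootstrap from the greedy action in the next state.
    return reward + gamma * max(q[next_state].values())


def expected_sarsa_target(q, reward, next_state, gamma, epsilon):
    # Bootstrap from the expected value under an epsilon-greedy policy:
    # probability (1 - epsilon) on the greedy value, epsilon spread uniformly.
    values = list(q[next_state].values())
    expected = (1 - epsilon) * max(values) + epsilon * sum(values) / len(values)
    return reward + gamma * expected


def td_update(q, state, action, target, alpha):
    # Move the current estimate a fraction alpha towards the target.
    q[state][action] += alpha * (target - q[state][action])
```

Only the target changes between the three methods; the update rule in `td_update` is shared.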

Double Temporal Difference

Name	On-Policy	Off-Policy	Python	C/C++
Double Sarsa	✔️		✔️	✔️
Double Q-learning	❌	✔️	✔️	✔️
Double Expected Sarsa	✔️		✔️	✔️
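
The "double" idea can be sketched for Double Q-learning as follows: two tables are kept, one selects the greedy next action and the other evaluates it, which reduces maximization bias. This is a standalone sketch, not REILLY's implementation; `double_q_update` and its parameters are hypothetical names.

```python
import random


def double_q_update(q_a, q_b, state, action, reward, next_state,
                    alpha, gamma, rng=random):
    """One Double Q-learning step on two q[state][action] tables."""
    # Pick at random which table learns on this step.
    if rng.random() < 0.5:
        learner, evaluator = q_a, q_b
    else:
        learner, evaluator = q_b, q_a

    # The learner selects the greedy next action...
    best = max(learner[next_state], key=learner[next_state].get)
    # ...and the other table supplies its value.
    target = reward + gamma * evaluator[next_state][best]
    learner[state][action] += alpha * (target - learner[state][action])
```

The `rng` argument only needs a `random()` method, so a fixed stub can be passed to make the update deterministic when testing.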

n-step Bootstrapping

Name	On-Policy	Off-Policy	Python	C/C++
n-step Sarsa	✔️		✔️	✔️
n-step Expected Sarsa	✔️		✔️	✔️
n-step Tree Backup	❌	✔️		✔️
n-step Q(σ)
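
All the n-step agents above build on the n-step return: n observed rewards followed by a bootstrapped estimate. A minimal sketch, assuming the bootstrap value has already been computed by the specific method (for example the Q-value of the next state-action pair for n-step Sarsa); `n_step_return` is a hypothetical helper, not part of REILLY.

```python
def n_step_return(rewards, bootstrap_value, gamma):
    """Discounted sum of n rewards plus a discounted bootstrap estimate.

    rewards: the n rewards observed after the state being updated, in order.
    bootstrap_value: value estimate at the state reached after n steps
    (use 0.0 if the episode terminated within the n steps).
    """
    g = bootstrap_value
    # Fold from the last reward backwards: g = r + gamma * g.
    for reward in reversed(rewards):
        g = reward + gamma * g
    return g
```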

Planning and Learning with Tabular Methods

Name	Python	C/C++
Random-sample one-step tabular Q-planning		✔️
Tabular Dyna-Q		✔️
Tabular Dyna-Q+		✔️
Prioritized sweeping		✔️
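
To show what the planning agents above add on top of plain Q-learning, here is a minimal Tabular Dyna-Q step in pure Python. It is a sketch under stated assumptions (deterministic environment, a Q-table stored as `q[state][action]`, terminal states absent from the table), and the function names are hypothetical rather than taken from REILLY's C++ backend.

```python
import random


def q_learning_update(q, s, a, r, s2, alpha, gamma):
    # States missing from the table (e.g. terminal ones) are worth 0.
    best_next = max(q[s2].values()) if s2 in q else 0.0
    q[s][a] += alpha * (r + gamma * best_next - q[s][a])


def dyna_q_step(q, model, s, a, r, s2, alpha, gamma, planning_steps,
                rng=random):
    # 1. Direct reinforcement learning from the real transition.
    q_learning_update(q, s, a, r, s2, alpha, gamma)
    # 2. Model learning: remember the last outcome of (s, a).
    model[(s, a)] = (r, s2)
    # 3. Planning: replay transitions sampled from the learned model.
    for _ in range(planning_steps):
        (ps, pa), (pr, ps2) = rng.choice(list(model.items()))
        q_learning_update(q, ps, pa, pr, ps2, alpha, gamma)
```

With `planning_steps=0` this reduces to ordinary one-step Q-learning; larger values reuse each real transition several times.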

Approximate Agents

Tile Coding

Name	Python	C/C++
1-D Tiling	✔️	✔️
n-D Tiling	✔️	✔️
Tiling offset	✔️	✔️
Different tiling dimensions	✔️	✔️
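
A minimal sketch of 1-D tiling with offsets, to illustrate the idea behind the features listed above. It is not REILLY's tile coder: `active_tiles_1d`, its parameters, and the uniform offset scheme are assumptions made for this example.

```python
import math


def active_tiles_1d(x, low, high, tiles_per_tiling, n_tilings):
    """Return one active tile index per tiling for a scalar x in [low, high]."""
    tile_width = (high - low) / tiles_per_tiling
    active = []
    for tiling in range(n_tilings):
        # Each tiling is shifted by a fraction of a tile width.
        offset = tiling * tile_width / n_tilings
        index = math.floor((x - low + offset) / tile_width)
        # The offset can push x into one extra tile, so each tiling
        # reserves tiles_per_tiling + 1 slots.
        index = min(max(index, 0), tiles_per_tiling)
        active.append(tiling * (tiles_per_tiling + 1) + index)
    return active
```

Nearby inputs share some but not all active tiles, which is what gives tile coding its local generalization: with 4 tiles on [0, 1] and 2 tilings, `x = 0.2` and `x = 0.3` fall into different tiles of the first tiling but the same tile of the second.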

Q Estimator

Name	Python	C/C++
Base implementation	✔️	✔️
With trace	✔️

MonteCarlo

Name	On-Policy	Off-Policy	Python	C/C++
Semi-gradient MonteCarlo	✔️			✔️

Temporal Difference

Name	On-Policy	Off-Policy	Differential	Python	C/C++
Semi-gradient Sarsa	✔️			✔️	✔️
Semi-gradient Expected Sarsa	✔️			✔️	✔️

n-step Bootstrapping

Name	On-Policy	Off-Policy	Differential	Python	C/C++
Semi-gradient n-step Sarsa	✔️			✔️	✔️
Semi-gradient n-step Expected Sarsa	✔️			✔️	✔️

Traces

Name	On-Policy	Python
Accumulating Trace	✔️	✔️
Replacing Trace	✔️	✔️
Dutch Trace
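
The two implemented trace types differ only in how a revisited entry is handled. A minimal tabular sketch, assuming the trace is a plain dictionary keyed by state (or state-action pair); `decay_and_mark` is a hypothetical helper, not REILLY's API.

```python
def decay_and_mark(trace, visited, gamma, lam, replacing=False):
    """Decay all traces by gamma * lambda, then mark the visited key."""
    for key in trace:
        trace[key] *= gamma * lam
    if replacing:
        # Replacing trace: reset the visited entry to exactly 1.
        trace[visited] = 1.0
    else:
        # Accumulating trace: add 1 on top of the decayed value.
        trace[visited] = trace.get(visited, 0.0) + 1.0
    return trace
```

When the same key is visited repeatedly, an accumulating trace can grow above 1, while a replacing trace is capped at 1.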

Eligibility Traces

Name	On-Policy	Python
Temporal difference (λ)
True Online TD(λ)
Sarsa(λ)	✔️	✔️
True Online Sarsa(λ)
Forward Sarsa(λ)
Watkins’s Q(λ)
Tree-Backup Q(λ)

Environments

Gym Environments

Name	Discrete State?	Discrete Action?	Linear State?	Multi-Agent?
FrozenLake4x4	Yes	Yes	Yes	No
FrozenLake8x8	Yes	Yes	Yes	No
Taxi	Yes	Yes	Yes	No
MountainCar	No	Yes	No	No

Custom Environments

Name	Discrete State?	Discrete Action?	Linear State?	Multi-Agent?
Text	Yes	Yes	No	Yes

Sessions

Name	Multi-Agent?	Joint Train?	Joint Test?
Session	No	No	No
JointSession	Yes	Optional	Yes
