Each week, I release a new version in order to track the progress of my work.
The PDF files summarize the most important notions studied during the lectures.
- `lecture_1.pdf` sums up the notions of Markov Chains, State Value Functions and Optimal Policies studied during lecture 1.
- `lecture_3.pdf` sums up the most important notions from the Utility Theory lecture and the main results of the Portfolio application problems for CARA and CRRA utility functions.
- `RL.pdf` contains some RL theorems and properties with their proofs.
- `problem1.ipynb` (Merton Application problem lecture summary + code application)
- `problem2.ipynb` (Option pricing lecture summary + code illustration)
- `processes`: This folder contains Python files implementing Markov Processes, Markov Reward Processes and Markov Decision Processes. All these processes are modelled as Python classes. The objective is to define objects that will be used in Dynamic Programming and Reinforcement Learning algorithms. The class structure is incremental: the _MP_ class is the basis for all other processes.
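The incremental design described above can be sketched as follows. This is a minimal, hypothetical illustration of the idea (an MRP extends an MP with rewards, an MDP adds actions); the repository's actual class names, attributes and signatures may differ.

```python
class MP:
    """Markov Process: states and transition probabilities."""
    def __init__(self, transitions):
        # transitions: {state: {next_state: probability}}
        self.transitions = transitions
        self.states = list(transitions.keys())


class MRP(MP):
    """Markov Reward Process: an MP plus a reward per state and a discount factor."""
    def __init__(self, transitions, rewards, gamma):
        super().__init__(transitions)
        self.rewards = rewards   # {state: reward}
        self.gamma = gamma


class MDP(MRP):
    """Markov Decision Process: transitions and rewards now depend on actions."""
    def __init__(self, transitions, rewards, gamma):
        # transitions: {state: {action: {next_state: probability}}}
        self.transitions = transitions
        self.rewards = rewards   # {state: {action: reward}}
        self.gamma = gamma
        self.states = list(transitions.keys())
```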
- `DP` (main Dynamic Programming algorithms)
- `RL` (main Reinforcement Learning algorithms)
  - algorithms for prediction
  - algorithms for control
  - algorithms for value approximation (prediction only)
- `option_pricing.py` (code for pricing European and American options)
- `run_predicitions.py` (code for running predictions with DP and RL algorithms)
- `utils`: helper code for the algorithms above
  - `sampling.py`: functions to generate sequences of episodes given an MDP and a Policy.
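A sketch of how such episode generation can work, assuming the nested-dict representation used throughout this repository. The function name and signature here are hypothetical, not the ones in `sampling.py`.

```python
import random

def sample_episode(transitions, policy, start_state, max_steps=100):
    """Generate one episode (a list of (state, action) pairs) by following
    a stochastic policy in an MDP given as nested dicts.

    transitions: {state: {action: {next_state: probability}}}
    policy:      {state: {action: probability}}
    """
    episode, state = [], start_state
    for _ in range(max_steps):
        if state not in policy:  # states absent from the policy are terminal
            break
        actions, probs = zip(*policy[state].items())
        action = random.choices(actions, weights=probs)[0]
        episode.append((state, action))
        next_states, probs = zip(*transitions[state][action].items())
        state = random.choices(next_states, weights=probs)[0]
    return episode
```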
Discrete Markov chains are implemented as Python classes. The data that feed these objects are stored as `dict`s. Here are some examples:
- MP:
- MRP:
- Policy:
- MDP:
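Since the original examples are not shown here, the dicts below are purely illustrative of the four formats (the states, actions and numbers are made up, not the repository's actual data):

```python
# MP: state -> {next_state: probability}
mp_data = {'sunny': {'sunny': 0.8, 'rainy': 0.2},
           'rainy': {'sunny': 0.4, 'rainy': 0.6}}

# MRP: the same transition dict plus a reward per state
mrp_rewards = {'sunny': 1.0, 'rainy': -1.0}

# Policy: state -> {action: probability}
policy_data = {'sunny': {'walk': 0.7, 'stay': 0.3},
               'rainy': {'walk': 0.1, 'stay': 0.9}}

# MDP: state -> {action: {next_state: probability}}
mdp_data = {'sunny': {'walk': {'sunny': 0.9, 'rainy': 0.1},
                      'stay': {'sunny': 0.8, 'rainy': 0.2}},
            'rainy': {'walk': {'sunny': 0.3, 'rainy': 0.7},
                      'stay': {'sunny': 0.4, 'rainy': 0.6}}}
```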
- To illustrate the methods and attributes of a _MP_ object, run `python3 mp.py`. The output is:
This folder contains a Python file `policy.py` with a Policy implementation. This class is used in the _MDP_ class. It also contains a file `det_policy.py` used in the policy improvement method (a method of _MDP_ objects).
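The role a deterministic policy plays in policy improvement can be sketched in a few lines: given action values, the improved policy is greedy with respect to them. This is the general idea behind `det_policy.py`, not the repository's actual implementation, and the function name here is hypothetical.

```python
def improve_policy(q_values):
    """Greedy policy improvement: from action values
    q_values = {state: {action: value}}, return the deterministic
    policy {state: best_action}."""
    return {s: max(actions, key=actions.get) for s, actions in q_values.items()}
```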