
MDPSolver


MDPSolver is a Python package for solving infinite-horizon Markov Decision Processes (MDPs) with discounted rewards.

Features

  • Fast solver: Our C++-based solver is substantially faster than other MDP packages available for Python. See details in the documentation.
  • Three optimization algorithms: Value iteration, Policy iteration, and Modified policy iteration.
  • Three value-update methods: Standard, Gauss–Seidel, Successive over-relaxation.
  • Supports sparse matrices.
  • Employs parallel computing.

Installation

Linux

Install directly from PyPI with:

pip install mdpsolver

After the installation, MDPSolver works out of the box.
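A quick import check verifies that the package loads:

python -c "import mdpsolver"

If the command exits without an error, the package is ready to use.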

Windows

Requires Visual Studio 2022 (17.9) with the MSVC C++ compiler and libraries.

After installing Visual Studio, install directly from PyPI with:

pip install mdpsolver

Quick start guide

The following shows how to quickly get started with mdpsolver.

Usage

Start by specifying the reward function and transition probabilities as lists. The following is an example of a simple MDP with three states and two actions in each state.

#Import packages
import mdpsolver

#Rewards (3 states x 2 actions)
#e.g. choosing second action in first state gives reward=-1
rewards = [[5,-1],
           [1,-2],
           [50,0]]

#Transition probabilities (3 from_states x 2 actions x 3 to_states)
#e.g. choosing first action in third state gives a probability of 0.6 of staying in third state
tranMatWithZeros = [[[0.9,0.1,0.0],[0.1,0.9,0.0]],
                    [[0.4,0.5,0.1],[0.3,0.5,0.2]],
                    [[0.2,0.2,0.6],[0.5,0.5,0.0]]]

Now, create the model object and insert the problem parameters.

#Create model object
mdl = mdpsolver.model()

#Insert the problem parameters
mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatWithZeros=tranMatWithZeros)

We can now optimize the policy.

mdl.solve()
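The optimization algorithm and value-update method from the feature list are also selected when solving. The keyword names and option strings in the following sketch are assumptions, not the confirmed API, so check the wiki for the exact signature:

#Hypothetical sketch - keyword names and values are assumptions, see the wiki
mdl.solve(algorithm="mpi",     #assumed options: value/policy/modified policy iteration
          update="standard")   #assumed options: standard, Gauss-Seidel, SOR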

The optimized policy can be returned in a variety of ways. Here, we return it as a list and print it directly in the terminal.

print(mdl.getPolicy())
#[1, 1, 0]
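The list contains one action index per state: the policy above picks the second action (index 1) in the first two states and the first action (index 0) in the last. The corresponding one-step rewards can be read off directly:

policy = mdl.getPolicy()
for state, action in enumerate(policy):
    print(f"state {state}: action {action}, immediate reward {rewards[state][action]}")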

Sparse transition matrix?

mdpsolver offers three alternative input formats for large and highly sparse transition probability matrices.

(1) Elementwise representation (zero entries omitted):

#[from_state,action,to_state,probability]
tranMatElementwise = [[0,0,0,0.9],
                      [0,0,1,0.1],
                      [0,1,0,0.1],
                      [0,1,1,0.9],
                      [1,0,0,0.4],
                      [1,0,1,0.5],
                      [1,0,2,0.1],
                      [1,1,0,0.3],
                      [1,1,1,0.5],
                      [1,1,2,0.2],
                      [2,0,0,0.2],
                      [2,0,1,0.2],
                      [2,0,2,0.6],
                      [2,1,0,0.5],
                      [2,1,1,0.5]]

mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatElementwise=tranMatElementwise)
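For larger models, writing the elementwise list by hand is impractical. A small helper (plain Python, not part of mdpsolver) derives it from a dense matrix by dropping the zero entries:

def to_elementwise(tranMat):
    #Flatten a dense (from_state x action x to_state) matrix into
    #[from_state, action, to_state, probability] rows, skipping zeros
    return [[s, a, t, p]
            for s, actions in enumerate(tranMat)
            for a, probs in enumerate(actions)
            for t, p in enumerate(probs)
            if p != 0.0]

tranMatElementwise = to_elementwise(tranMatWithZeros)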

(2) Probabilities and column (to_state) indices in separate lists:

tranMatProbs = [[[0.9,0.1],[0.1,0.9]],
                [[0.4,0.5,0.1],[0.3,0.5,0.2]],
                [[0.2,0.2,0.6],[0.5,0.5]]]

tranMatColumns = [[[0,1],[0,1]],
                  [[0,1,2],[0,1,2]],
                  [[0,1,2],[0,1]]]

mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatProbs=tranMatProbs,
        tranMatColumns=tranMatColumns)
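These two lists can likewise be derived from a dense matrix (plain Python, not part of mdpsolver):

def to_sparse_lists(tranMat):
    #Split a dense matrix into non-zero probabilities and their to_state indices
    probs = [[[p for p in row if p != 0.0] for row in actions]
             for actions in tranMat]
    columns = [[[t for t, p in enumerate(row) if p != 0.0] for row in actions]
               for actions in tranMat]
    return probs, columns

tranMatProbs, tranMatColumns = to_sparse_lists(tranMatWithZeros)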

(3) Load the elementwise representation from a file:

mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatFromFile="transitions.csv")
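Assuming transitions.csv holds the same elementwise columns as format (1), i.e. one from_state, action, to_state, probability row per line (check the wiki for the exact expected format), such a file can be produced with the standard csv module:

import csv

#Assumption: one elementwise row [from_state, action, to_state, probability] per line
with open("transitions.csv", "w", newline="") as f:
    csv.writer(f).writerows(tranMatElementwise)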

Documentation

The documentation can be found in the MDPSolver wiki (https://github.com/areenberg/MDPSolver/wiki).

How to cite

DOI