Probabilistic Inference for Learning Control (PILCO)

A modern & clean implementation of the PILCO Algorithm in TensorFlow v2.

Unlike PILCO's original implementation, which was written as a self-contained MATLAB package, this repository aims to provide a clean implementation that relies heavily on modern machine learning libraries.

In particular, we use TensorFlow v2 to avoid the need for hardcoded gradients and to scale to GPU architectures. Moreover, we use GPflow v2 for Gaussian Process Regression.
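To illustrate what "avoiding hardcoded gradients" means in practice, here is a minimal, dependency-free sketch of automatic differentiation. It is not code from this repository: TensorFlow uses reverse-mode differentiation, whereas this sketch uses forward-mode dual numbers because they fit in a few lines. The `Dual` class, the `saturating_cost` function and its `target` and `width` parameters are hypothetical names chosen for illustration.

```python
import math


class Dual:
    """A value together with its derivative w.r.t. one input (forward-mode AD)."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def _lift(other):
        # Treat plain numbers as constants (derivative zero).
        return other if isinstance(other, Dual) else Dual(float(other))

    def __add__(self, other):
        other = self._lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __sub__(self, other):
        other = self._lift(other)
        return Dual(self.val - other.val, self.der - other.der)

    def __rsub__(self, other):
        return self._lift(other) - self

    def __mul__(self, other):
        other = self._lift(other)
        # Product rule.
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)

    __rmul__ = __mul__


def exp(x):
    """Exponential of a Dual, propagating the derivative via the chain rule."""
    e = math.exp(x.val)
    return Dual(e, e * x.der)


def saturating_cost(x, target=0.0, width=1.0):
    """Illustrative cost 1 - exp(-0.5 * ((x - target) / width)**2)."""
    d = (x - target) * (1.0 / width)
    return 1.0 - exp(-0.5 * (d * d))


def cost_and_gradient(x0):
    """Evaluate the cost and its derivative at x0 without a hand-written gradient."""
    out = saturating_cost(Dual(x0, 1.0))  # seed derivative dx/dx = 1
    return out.val, out.der
```

Only the forward computation of `saturating_cost` is written by hand; the derivative falls out of the arithmetic rules. This is the same convenience, at a much smaller scale, that a framework with automatic differentiation provides for the full policy optimisation.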

The core functionality is tested against the original MATLAB implementation.

Example of usage

Before using PILCO you have to install it by running:

git clone https://github.com/nrontsis/PILCO && cd PILCO
python setup.py develop

It is recommended to install everything in a fresh conda environment with Python >= 3.7.

The examples included in this repo use OpenAI gym 0.15.3 and mujoco-py 2.0.2.7. These dependencies should be installed manually. Then, you can run one of the examples as follows:

python examples/inverted_pendulum.py

Example Extension: Safe PILCO

As an example of the extensibility of the framework, we include in the folder safe_pilco_extension an extension of the standard PILCO algorithm that takes safety constraints (defined on the environment's state space) into account as in https://arxiv.org/abs/1712.05556. The safe_swimmer_run.py and safe_cars_run.py in the examples folder demonstrate the use of this extension.
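As a rough illustration of what a safety constraint on the state space can look like when the predicted state is Gaussian, the sketch below computes the probability that a one-dimensional state stays inside a safe interval. It is a simplified sketch, not the code in safe_pilco_extension: the function names are hypothetical, the state is scalar, and multiplying per-step probabilities treats the time steps as independent, which is an assumption made here for brevity.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a 1-D Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def prob_in_interval(mean, std, low, high):
    """Probability that a Gaussian state N(mean, std**2) lies in [low, high]."""
    return normal_cdf(high, mean, std) - normal_cdf(low, mean, std)


def trajectory_safety(means, stds, low, high):
    """Product of per-step probabilities of staying in [low, high].

    Assumption: time steps are treated as independent.
    """
    p = 1.0
    for m, s in zip(means, stds):
        p *= prob_in_interval(m, s, low, high)
    return p
```

For example, `trajectory_safety([0.0, 0.1], [0.2, 0.3], -1.0, 1.0)` gives the estimated probability that a two-step predicted trajectory remains within [-1, 1]; a quantity of this kind could then be traded off against the task reward during policy optimisation.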

Credits:

The following people have been involved in the development of this package:

References

See the following publications for a description of the algorithm: 1, 2, 3