PyDaddy

A Python package to discover stochastic differential equations from time series data.

PyDaddy is a comprehensive, easy-to-use Python package for discovering data-derived stochastic differential equations (SDEs) from time series data. It takes the time series of a state variable $x$, either a scalar or a 2-dimensional vector, as input and discovers an SDE of the form:

$$ \frac{dx}{dt} = f(x) + g(x) \cdot \eta(t) $$

where $\eta(t)$ is Gaussian white noise. The function $f$ is called the drift, and governs the deterministic part of the dynamics. $g^2$ is called the diffusion and governs the stochastic part of the dynamics.
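To make the roles of drift and diffusion concrete, here is a minimal standard-library sketch that does not use PyDaddy's API. It simulates a scalar SDE with the Euler-Maruyama scheme and then estimates $f$ and $g^2$ from the first two conditional moments of the increments in each bin of $x$. The function names (`simulate_sde`, `binned_drift_diffusion`), the Ornstein-Uhlenbeck test model $f(x) = -x$, $g(x) = 0.5$, and all numerical settings are illustrative assumptions; the estimators PyDaddy actually uses may differ in detail.

```python
import math
import random


def simulate_sde(f, g, x0, dt, n_steps, seed=0):
    """Euler-Maruyama integration of dx = f(x) dt + g(x) dW (Ito form)."""
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    x = [x0]
    for _ in range(n_steps - 1):
        xi = x[-1]
        x.append(xi + f(xi) * dt + g(xi) * sqrt_dt * rng.gauss(0.0, 1.0))
    return x


def binned_drift_diffusion(x, dt, lo, hi, n_bins):
    """Estimate drift E[dx | x] / dt and diffusion E[dx^2 | x] / dt per bin."""
    width = (hi - lo) / n_bins
    count = [0] * n_bins
    sum_dx = [0.0] * n_bins
    sum_dx2 = [0.0] * n_bins
    for xi, x_next in zip(x, x[1:]):
        k = math.floor((xi - lo) / width)
        if 0 <= k < n_bins:  # ignore samples outside [lo, hi)
            dx = x_next - xi
            count[k] += 1
            sum_dx[k] += dx
            sum_dx2[k] += dx * dx
    centers, drift, diffusion = [], [], []
    for k in range(n_bins):
        if count[k] > 0:  # skip empty bins
            centers.append(lo + (k + 0.5) * width)
            drift.append(sum_dx[k] / (count[k] * dt))
            diffusion.append(sum_dx2[k] / (count[k] * dt))
    return centers, drift, diffusion


# Assumed test model: Ornstein-Uhlenbeck process with f(x) = -x, g(x) = sigma.
sigma = 0.5
x = simulate_sde(lambda v: -v, lambda v: sigma, x0=0.0, dt=0.01, n_steps=200_000)
centers, drift, diffusion = binned_drift_diffusion(x, dt=0.01, lo=-1.0, hi=1.0, n_bins=10)

for c, f_hat, g2_hat in zip(centers, drift, diffusion):
    print(f"x = {c:+.1f}   drift = {f_hat:+.3f}   diffusion = {g2_hat:.3f}")
```

With these settings the estimated drift should fall roughly linearly with $x$, and the estimated diffusion should stay close to $\sigma^2 = 0.25$, with noisier values in the sparsely visited outer bins.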

An example summary plot generated by PyDaddy, for a vector time series dataset.

PyDaddy also provides equation learning for the drift and diffusion functions using sparse regression, along with a suite of diagnostic functions. For more details on how to use the package, check out the example notebooks and documentation.
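As a rough illustration of what sparse regression means here, the sketch below fits a polynomial by sequentially thresholded least squares: fit all terms, drop those whose coefficient magnitude falls below a threshold, and refit the survivors until the set of terms stops changing. This is one common sparse-regression scheme and is shown only as an assumption about the general idea, not as PyDaddy's implementation. The names `solve_normal_equations` and `sparse_polyfit`, the threshold of 0.1, and the synthetic data $y = x - x^3$ plus noise are all hypothetical.

```python
import random


def solve_normal_equations(columns, y):
    """Least-squares coefficients via (A^T A) c = A^T y and Gauss-Jordan elimination.

    Raises ZeroDivisionError if the columns are linearly dependent.
    """
    n = len(columns)
    # Augmented matrix [A^T A | A^T y].
    aug = [
        [sum(a * b for a, b in zip(columns[i], columns[j])) for j in range(n)]
        + [sum(a * b for a, b in zip(columns[i], y))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def sparse_polyfit(x, y, degree=3, threshold=0.1, max_iter=10):
    """Return coefficients [c0, c1, ..., c_degree]; pruned terms are exactly 0.0."""
    active = list(range(degree + 1))  # polynomial powers still in the model
    coeffs = [0.0] * (degree + 1)
    for _ in range(max_iter):
        columns = [[xi ** p for xi in x] for p in active]
        solution = solve_normal_equations(columns, y)
        coeffs = [0.0] * (degree + 1)
        kept = []
        for p, c in zip(active, solution):
            if abs(c) >= threshold:
                coeffs[p] = c
                kept.append(p)
        if kept == active or not kept:
            break  # nothing pruned this round, or nothing left to fit
        active = kept
    return coeffs


# Hypothetical noisy samples of y = x - x^3 on a grid in [-1, 1].
rng = random.Random(0)
x = [-1.0 + 2.0 * i / 40 for i in range(41)]
y = [xi - xi ** 3 + rng.gauss(0.0, 0.02) for xi in x]

coeffs = sparse_polyfit(x, y)
print([round(c, 2) for c in coeffs])
```

The threshold controls the trade-off between parsimony and fidelity: a larger value prunes more terms and gives a simpler expression, at the risk of discarding small but real contributions.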

Schematic illustration of PyDaddy functionality.

Getting started

PyDaddy can be run online on Google Colab without installing it on your local machine. To do so, open a notebook on Colab, paste the following code into a notebook cell, and run it:

%pip install git+https://github.com/tee-lab/PyDaddy.git

This sets up PyDaddy in the notebook environment.

Several example notebooks are provided to help you get familiar with PyDaddy's features. All of them can be run on Colab.

  • Getting started: Introduction to the basic functionalities of PyDaddy, using a 1-dimensional dataset.
  • Getting started with vector data: Introduction to the basic functionalities of PyDaddy on 2-dimensional datasets.
  • Advanced function fitting: PyDaddy can discover analytical expressions for the drift and diffusion functions. This notebook describes how to customize the fitting procedure to obtain best results.
  • Recovering SDEs from synthetic time series: This notebook generates a simulated time series from a user-specified SDE, and uses PyDaddy to recover the drift and diffusion functions from the simulated time series.
  • Exporting data: Demonstrates how to export the recovered drift and diffusion data as CSV files or Pandas data-frames.
  • Fitting non-polynomial functions: PyDaddy fits polynomial functions to drift and diffusion by default. This notebook illustrates how to customize that behaviour.

There are also two notebooks that use PyDaddy to discover SDEs from real-world datasets.

Installation

PyDaddy is available both on PyPI and Anaconda Cloud, and can be installed on any system with a Python 3 environment. If you don't have Python 3 installed on your system, we recommend using Anaconda or Miniconda. See the PyDaddy package documentation for detailed installation instructions.

Using pip

To install the latest stable release version of PyDaddy, use:

pip install pydaddy

To install the latest development version of PyDaddy, use:

pip install git+https://github.com/tee-lab/PyDaddy.git

Using conda

To install using conda, Anaconda or Miniconda needs to be installed first. Once this is done, use the following command:

conda install -c tee-lab pydaddy
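After installing by either route, a quick way to confirm that the package is visible to the current Python environment is to query its installed distribution metadata. The snippet below is a small sketch using only the standard library (Python 3.8+). It assumes the distribution is registered under the name `pydaddy`, as in the pip command above; the helper name `installed_version` is illustrative, and a conda install may or may not expose the same metadata.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "pydaddy" is the distribution name used in the pip command; adjust if needed.
version = installed_version("pydaddy")
print("pydaddy not found" if version is None else f"pydaddy {version} is installed")
```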

Documentation

For more information about PyDaddy, check out the package documentation.

Citation

If you are using this package in your research, please cite the repository and the associated paper as follows:

Nabeel, A., Karichannavar, A., Palathingal, S., Jhawar, J., Danny Raj, M., & Guttal, V. (2022). PyDaddy: A Python Package for Discovering SDEs from Time Series Data (Version 1.1.0) [Computer software]. https://github.com/tee-lab/PyDaddy

Nabeel, A., Karichannavar, A., Palathingal, S., Jhawar, J., Danny Raj, M., & Guttal, V. (2022). PyDaddy: A Python package for discovering stochastic dynamical equations from timeseries data. arXiv preprint arXiv:2205.02645.

Licence

PyDaddy is distributed under the GNU General Public License v3.0.