Jammyjamjamman/PyWinSlide

PyWinSlide

Sliding window functions for processing iterative timeseries data in python.

This project is still in its very early stages. If you need rolling window functions, I recommend first looking at the rolling timeseries processing in pandas dataframes.

Why you may or may not want to use this script.

I'm making this script as an alternative to the pandas rolling window because pandas requires the entire timeseries to be loaded into memory. The functions provided in this script allow data to be processed iteratively, which can reduce memory usage compared to pandas. However, the script is significantly slower than pandas at rolling calculations.

What is supplied.

Currently, a generic Window class is provided, along with a sliding_window function that uses it. These are the building blocks for making new sliding window functions.

Usable functions supplied are:

  • sliding_mean_var(), for calculating the rolling mean and variance within an iterative window.
  • mean_downresample(), for reducing the sample frequency of a timeseries, e.g. to one value every minute instead of one every second, by taking the mean of the per-second values within each one-minute window.
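
To illustrate the idea behind an iterative rolling mean and variance, here is a minimal self-contained sketch. It is not the code in pywinslide.py; the name rolling_mean_var and the eviction rule (drop samples that are a full window or more older than the newest one) are assumptions made for this example.

```python
from collections import deque
from datetime import datetime, timedelta


def rolling_mean_var(series, window_sz=timedelta(days=1)):
    """Yield (timestamp, mean, variance) for each sample in `series`.

    `series` is any iterable of (datetime, number) tuples in ascending time.
    Only the samples inside the current window are held in memory.
    """
    window = deque()   # samples currently inside the window
    total = 0.0        # running sum of the values in the window
    total_sq = 0.0     # running sum of the squared values in the window
    for stamp, value in series:
        window.append((stamp, value))
        total += value
        total_sq += value * value
        # Evict samples that are a full window (or more) older than the newest.
        while stamp - window[0][0] >= window_sz:
            _, old = window.popleft()
            total -= old
            total_sq -= old * old
        n = len(window)
        mean = total / n
        # Population variance; clamp tiny negative values caused by rounding.
        var = max(total_sq / n - mean * mean, 0.0)
        yield stamp, mean, var
```

Because the function is a generator fed by another iterable, memory use is bounded by the number of samples in one window rather than by the length of the whole timeseries. The running sum of squares is simple but can lose precision on long runs of large values; a Welford-style update would be a more robust choice.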

To create your own sliding window function, create a new class that inherits from Window. To get the statistics you want from the window, override the method get_cur_stats(). Then call the sliding_window function and pass your Window subclass to the Window_cls argument.
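
The subclassing pattern described above can be sketched with stand-in classes. Everything here except the names get_cur_stats and Window_cls is hypothetical: SketchWindow, its add() method and its constructor, and sketch_sliding_window are invented for illustration and do not reproduce the real Window class or sliding_window function, whose signatures should be checked in pywinslide.py.

```python
from collections import deque
from datetime import datetime, timedelta


class SketchWindow:
    """Hypothetical stand-in for a generic window; not pywinslide's Window."""

    def __init__(self, window_sz):
        self.window_sz = window_sz
        self.samples = deque()  # (timestamp, value) pairs inside the window

    def add(self, stamp, value):
        # Append the new sample, then evict samples that have aged out.
        self.samples.append((stamp, value))
        while stamp - self.samples[0][0] >= self.window_sz:
            self.samples.popleft()

    def get_cur_stats(self):
        # Subclasses override this to return their statistics as a tuple.
        raise NotImplementedError


class MinMaxWindow(SketchWindow):
    """Example subclass: report the smallest and largest value in the window."""

    def get_cur_stats(self):
        values = [value for _, value in self.samples]
        return min(values), max(values)


def sketch_sliding_window(series, Window_cls, window_sz=timedelta(days=1)):
    """Feed each sample to a Window_cls instance and yield its statistics."""
    window = Window_cls(window_sz)
    for stamp, value in series:
        window.add(stamp, value)
        yield (stamp, *window.get_cur_stats())
```

The design point is that the driver function owns the iteration, while the window class owns the bookkeeping and the statistic, so a new statistic only needs a new get_cur_stats().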

Using The Script.

The easiest way to use the script currently is to place pywinslide.py in the same directory as the script or jupyter notebook you want to use it with. The following is an example of how to use it:

from datetime import timedelta

import pywinslide

"""
The iterator 'timeseries_iter' should yield a tuple of the form (timestamp, number)
every iteration, in ascending time. N.B. The timestamp is a datetime object. The function does not handle
None types or other incorrect types.
"""

# Some timeseries iterator ('my_iter' is a placeholder for your own iterator).
timeseries_iter = my_iter

# Create lists of times, means and vars.
times = []
means = []
variances = []
# Create and iterate through a rolling mean and variance function.
# The size of the window is 1 day, which is also the default.
for time, mean, var in pywinslide.sliding_mean_var(timeseries_iter, window_sz=timedelta(days=1)):
    times.append(time)
    means.append(mean)
    variances.append(var)
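
The example above leaves my_iter undefined. One possible way to build such an iterator is sketched below, assuming the data arrives as CSV rows of the form "ISO timestamp,value". The function name iter_csv_timeseries and the row format are assumptions for this example and are not part of pywinslide.

```python
import csv
from datetime import datetime


def iter_csv_timeseries(lines):
    """Yield (datetime, float) tuples from CSV rows of 'ISO timestamp,value'.

    `lines` can be an open file or any iterable of strings, so rows are read
    lazily. Rows without exactly two fields or with an empty value are skipped,
    because the sliding functions do not handle None or other incorrect types.
    """
    for row in csv.reader(lines):
        if len(row) != 2 or not row[1].strip():
            continue
        yield datetime.fromisoformat(row[0].strip()), float(row[1])
```

Passing an open file object, e.g. iter_csv_timeseries(open("data.csv", newline="")), keeps only one row in memory at a time; the rows must already be sorted in ascending time.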

Other Plans.

  • Add a jupyter notebook, demonstrating using this script.
  • Add more comments to the script.
  • I would really like to make cython versions of these functions in the future. However, I have no experience in using cython.
