Interpolate (upsample) non-equispaced timeseries into equispaced 18.0rc1 #12552

marcelnem · 2016-03-07T12:16:03Z

I want to interpolate (upsample) a non-equispaced time series to obtain an equispaced time series.

Currently I am doing it the following way:

1. Take the original time series.
2. Create a new time series with NaN values at 30-second intervals (using resample('30S').asfreq()).
3. Concatenate the original time series and the new time series.
4. Sort the result to restore chronological order (this is the step I do not like: sorting has O(n log n) complexity).
5. Interpolate.
6. Remove the original points from the time series.

Is there a simpler way? In MATLAB, you keep the original time series and pass the new times as a parameter to the interpolation function to get values at the desired times. Ideally I would like to have a function such as

origTimeSeries.interpolate(newIndex=newTimeIndex, method='spline')

Note that the times of the original time series might not be a subset of the times of the desired time series.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Original, non-equispaced samples
values = [271238, 329285, 50, 260260, 263711]
timestamps = pd.to_datetime(['2015-01-04 08:29:4',
                             '2015-01-04 08:37:05',
                             '2015-01-04 08:41:07',
                             '2015-01-04 08:43:05',
                             '2015-01-04 08:49:05'])

ts = pd.Series(values, index=timestamps)
ts
# Treat -1 as a missing-value marker
ts[ts==-1] = np.nan
# Series on the regular 60-second grid; NaN wherever no sample falls exactly on the grid
newFreq=ts.resample('60S').asfreq()

# Combine original and grid timestamps, restore time order, then fill the NaNs
new=pd.concat([ts,newFreq]).sort_index()
new=new.interpolate(method='time')

ts.plot(marker='o')
new.plot(marker='+',markersize=15)

# Keep only the values at the grid timestamps (the equispaced result)
new[newFreq.index].plot(marker='.')

lines, labels = plt.gca().get_legend_handles_labels()
labels = ['original values (nonequispaced)', 'original + interpolated at new frequency (nonequispaced)', 'interpolated values without original values (equispaced!)']
plt.legend(lines, labels, loc='best')
plt.show()
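
The wished-for call could be approximated with a small wrapper. This is only a sketch: `interpolate_to_index` is a hypothetical helper name, not a pandas API, and the sample data is made up so that each value equals the number of seconds since midnight. It reindexes onto the union of both indexes, interpolates, and keeps only the requested timestamps. The union still has to put the timestamps in order internally, so this tidies the code rather than removing the ordering step.

```python
import pandas as pd


def interpolate_to_index(series, new_index, method="time"):
    """Hypothetical helper: return `series` interpolated at `new_index`."""
    # Union of both indexes: sorted and free of duplicates.
    combined = series.index.union(new_index)
    # Reindexing inserts NaN at the new timestamps; interpolate fills them.
    filled = series.reindex(combined).interpolate(method=method)
    # Keep only the requested timestamps.
    return filled.reindex(new_index)


# Illustrative data (not from the issue): value == seconds since midnight.
ts = pd.Series(
    [10.0, 150.0, 300.0],
    index=pd.to_datetime(
        ["2015-01-04 00:00:10", "2015-01-04 00:02:30", "2015-01-04 00:05:00"]
    ),
)
grid = pd.date_range(ts.index[0].ceil("60s"), ts.index[-1].floor("60s"), freq="60s")
equispaced = interpolate_to_index(ts, grid)
print(equispaced)
```

Grid points that lie before the first original sample stay NaN with `method="time"`, because there is no left neighbour to interpolate from.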

jreback · 2016-03-07T12:28:17Z

use ordered_merge rather than concat and sort
http://pandas.pydata.org/pandas-docs/stable/merging.html#merging-ordered-data
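
A minimal sketch of the ordered-merge route, under some assumptions: it uses `pd.merge_ordered`, the name under which `ordered_merge` is available in later pandas releases, and made-up sample data in which each value equals the number of seconds since midnight. The column names `time` and `value` are chosen only for this illustration.

```python
import pandas as pd

# Illustrative data (not from the issue): value == seconds since midnight.
ts = pd.Series(
    [10.0, 150.0, 300.0],
    index=pd.to_datetime(
        ["2015-01-04 00:00:10", "2015-01-04 00:02:30", "2015-01-04 00:05:00"]
    ),
)
grid = pd.date_range(ts.index[0].ceil("60s"), ts.index[-1].floor("60s"), freq="60s")

# merge_ordered works on DataFrames, so move the timestamps into a column.
left = pd.DataFrame({"time": ts.index, "value": ts.to_numpy()})
right = pd.DataFrame({"time": grid})

# Outer merge (the default) ordered by the key; grid-only rows get NaN values.
merged = pd.merge_ordered(left, right, on="time")

# Interpolate over time, then keep only the grid timestamps.
combined = merged.set_index("time")["value"]
equispaced = combined.interpolate(method="time").loc[grid]
print(equispaced)
```

A timestamp present in both inputs (00:05:00 here) appears once in the merged frame, so the final selection by `grid` stays unambiguous.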

marcelnem · 2016-03-07T12:42:43Z

It would be nice to do it without a merge altogether, since I do not really need the merged time series; I only need the resulting equispaced time series. Is the way I described (enhanced with ordered_merge) the most efficient way to do this? Maybe using scipy directly would be better:

http://docs.scipy.org/doc/scipy-0.14.0/reference/tutorial/interpolate.html#d-interpolation-interp1d
scipy lets you do it MATLAB-style: keep the original time series and pass a new index to obtain the new time series.
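
The "pass the new times" style can be sketched as follows. To keep the example dependency-light it uses NumPy's `np.interp` as a linear stand-in for `scipy.interpolate.interp1d`, which is a substitution made here and not what the linked tutorial shows. The sample data is made up (each value equals the number of seconds since midnight), and timestamps are converted to seconds relative to the first sample.

```python
import numpy as np
import pandas as pd

# Illustrative data (not from the issue): value == seconds since midnight.
ts = pd.Series(
    [10.0, 150.0, 300.0],
    index=pd.to_datetime(
        ["2015-01-04 00:00:10", "2015-01-04 00:02:30", "2015-01-04 00:05:00"]
    ),
)
new_index = pd.date_range(ts.index[0].floor("60s"), ts.index[-1].floor("60s"), freq="60s")

# Express both sets of timestamps as seconds from a common origin.
origin = ts.index[0]
x = (ts.index - origin).total_seconds().to_numpy()
xi = (new_index - origin).total_seconds().to_numpy()

# Linear interpolation at the new times; NaN outside the sampled range
# instead of np.interp's default of repeating the end values.
yi = np.interp(xi, x, ts.to_numpy(dtype=float), left=np.nan, right=np.nan)
equispaced = pd.Series(yi, index=new_index)
print(equispaced)
```

No merged series is built here, but `np.interp` expects the original timestamps to be in increasing order.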

Also, I will be working with online data, so the original time series will grow and I will need to interpolate the new data and append the results to the interpolated (equispaced) time series.
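
For the growing-series case, one possible shape is to carry the last original sample forward and interpolate only the grid points that each new chunk makes computable. This is a rough sketch: `extend_equispaced` is a hypothetical helper, interpolation is linear via `np.interp`, chunks are assumed to arrive in time order, and the sample data is made up (each value equals the number of seconds since midnight).

```python
import numpy as np
import pandas as pd


def extend_equispaced(equispaced, tail, chunk, freq="60s"):
    """Hypothetical helper: append newly computable grid points.

    equispaced: regular series built so far, or None on the first call
    tail:       last original sample seen so far (one-row Series), or None
    chunk:      newly arrived original samples, sorted by time
    Returns the updated regular series and the new tail.
    """
    # Prepend the carried-over sample so the first new grid point
    # has a left neighbour to interpolate from.
    known = chunk if tail is None else pd.concat([tail, chunk])
    if equispaced is None or equispaced.empty:
        start = known.index[0].ceil(freq)
    else:
        start = equispaced.index[-1] + pd.Timedelta(freq)
    # Only grid points up to the newest original sample can be computed.
    grid = pd.date_range(start, known.index[-1].floor(freq), freq=freq)
    origin = known.index[0]
    x = (known.index - origin).total_seconds().to_numpy()
    xi = (grid - origin).total_seconds().to_numpy()
    new_part = pd.Series(np.interp(xi, x, known.to_numpy(dtype=float)), index=grid)
    result = new_part if equispaced is None else pd.concat([equispaced, new_part])
    return result, known.iloc[-1:]


# Illustrative chunks (not from the issue): value == seconds since midnight.
chunk1 = pd.Series(
    [10.0, 150.0],
    index=pd.to_datetime(["2015-01-04 00:00:10", "2015-01-04 00:02:30"]),
)
chunk2 = pd.Series(
    [255.0, 300.0],
    index=pd.to_datetime(["2015-01-04 00:04:15", "2015-01-04 00:05:00"]),
)

regular, tail = extend_equispaced(None, None, chunk1)
regular, tail = extend_equispaced(regular, tail, chunk2)
print(regular)
```

Each call touches only the new chunk plus one carried-over sample, so earlier results are never recomputed.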

jreback · 2016-03-07T14:01:42Z

this gets you pretty close

In [42]: ts.reindex(ts.resample('60s').asfreq().index, method='nearest', tolerance=pd.Timedelta('60s')).interpolate('time')
Out[42]: 
2015-01-04 08:29:00    271238.000000
2015-01-04 08:30:00    271238.000000
2015-01-04 08:31:00    279530.428571
2015-01-04 08:32:00    287822.857143
2015-01-04 08:33:00    296115.285714
2015-01-04 08:34:00    304407.714286
2015-01-04 08:35:00    312700.142857
2015-01-04 08:36:00    320992.571429
2015-01-04 08:37:00    329285.000000
2015-01-04 08:38:00    329285.000000
2015-01-04 08:39:00    219540.000000
2015-01-04 08:40:00    109795.000000
2015-01-04 08:41:00        50.000000
2015-01-04 08:42:00        50.000000
2015-01-04 08:43:00    260260.000000
2015-01-04 08:44:00    260260.000000
2015-01-04 08:45:00    260950.200000
2015-01-04 08:46:00    261640.400000
2015-01-04 08:47:00    262330.600000
2015-01-04 08:48:00    263020.800000
2015-01-04 08:49:00    263711.000000
Freq: 60S, dtype: float64
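
Why this is only "pretty close": `method='nearest'` copies the nearest original value onto each grid point within the tolerance, so values are snapped rather than interpolated (in the output above, 08:29:00 and 08:30:00 both carry 271238). The sketch below illustrates the effect on made-up data where each value equals the number of seconds elapsed, so the exact linear answer at every grid point is known in advance.

```python
import pandas as pd

# Illustrative data (not from the issue): value == seconds since 00:00:00.
ts = pd.Series(
    [0.0, 150.0, 300.0],
    index=pd.to_datetime(
        ["2015-01-04 00:00:00", "2015-01-04 00:02:30", "2015-01-04 00:05:00"]
    ),
)
grid = ts.resample("60s").asfreq().index

# Same pattern as the snippet above: snap to the nearest sample within 60 s,
# then interpolate whatever is still NaN.
snapped = ts.reindex(
    grid, method="nearest", tolerance=pd.Timedelta("60s")
).interpolate(method="time")

# Exact linear values on the grid, known by construction of the data.
exact = pd.Series((grid - grid[0]).total_seconds().to_numpy(), index=grid)

error = snapped - exact
print(pd.DataFrame({"snapped": snapped, "exact": exact, "error": error}))
```

With this data every grid point lies within the tolerance of some sample, so nothing is left for `interpolate` to fill and the result is a step function. A smaller tolerance leaves more NaN gaps for the interpolation step.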

mroeschke added the Enhancement, Timeseries, and Algos labels on Jan 13, 2019

mroeschke added the Resample label and removed the Algos label on Mar 31, 2020

mroeschke added the Missing-data label and removed the Resample and Timeseries labels on Apr 23, 2021

kopytjuk mentioned this issue Mar 25, 2023

DOC warn user about potential information loss in Resampler.interpolate #52198

Merged