
Recursive Least Squares with Exponential Forgetting #75

Open · 4 tasks
buybackoff opened this issue Dec 9, 2016 · 8 comments

Comments


buybackoff commented Dec 9, 2016

Rewrite this R implementation in efficient non-allocating .NET code https://gist.github.com/buybackoff/bfeb9c8959157dbf86310d383c93f00d

  • Should return a structure with the coefficients, the current residual value, and the EW variance of the residuals with the same alpha.
  • Return t-stats for the coefficients, R^2, F-stat, etc. We could calculate an EW R^2 just by using two EW variances - one for Y and one for the residuals. The t-stats could be calculated from the xpxi matrix, which is a part of the state.
  • A serial correlation test could be done as another online regression of current on previous residuals, but it should be optional (or we could just take the residuals from the output and apply RLSEF to them; the t-stat of that coefficient is then the DW test).
  • Test with alpha = 1 and compare to the known results for all stats (the R version is OK).

NB: since at each step the matrix sizes are limited by the number of X variables, a manual implementation of mmult/minverse could be cheaper than calling any "optimized" library.
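
For orientation, here is a minimal sketch of what a single exponentially-forgetting RLS step could look like in non-allocating .NET code. The names (`RlsState`, `Rls.Step`), the diffuse-prior `delta`, and the exact EW-variance weighting are illustrative assumptions, not the gist's API; `lambda` is assumed to play the role of the gist's alpha.

```csharp
// Sketch of one RLS step with exponential forgetting. State: coefficient vector
// Beta and P ~ (X'WX)^-1, the inverse of the exponentially weighted X'X.
// All names here are illustrative assumptions, not the repo's API.
public sealed class RlsState
{
    public readonly double[] Beta;   // coefficients
    public readonly double[,] P;     // inverse of the weighted X'X (xpxi)
    public double Residual;          // last a-priori residual
    public double EwVariance;        // EW variance of the residuals, same forgetting

    public RlsState(int k, double delta = 1000.0)
    {
        Beta = new double[k];
        P = new double[k, k];
        for (int i = 0; i < k; i++) P[i, i] = delta; // large diagonal = diffuse prior (assumed)
    }
}

public static class Rls
{
    public static void Step(RlsState s, double[] x, double y, double lambda)
    {
        int k = s.Beta.Length;
        // px = P * x and xpx = x' P x (P is symmetric)
        var px = new double[k]; // a truly non-allocating version would keep this buffer in the state
        double xpx = 0.0;
        for (int i = 0; i < k; i++)
        {
            double acc = 0.0;
            for (int j = 0; j < k; j++) acc += s.P[i, j] * x[j];
            px[i] = acc;
            xpx += x[i] * acc;
        }
        double denom = lambda + xpx;

        // a-priori residual e = y - x'beta
        double yHat = 0.0;
        for (int i = 0; i < k; i++) yHat += x[i] * s.Beta[i];
        double e = y - yHat;

        // beta += (P x / denom) * e;   P = (P - (P x)(P x)' / denom) / lambda
        for (int i = 0; i < k; i++) s.Beta[i] += px[i] * e / denom;
        for (int i = 0; i < k; i++)
            for (int j = 0; j < k; j++)
                s.P[i, j] = (s.P[i, j] - px[i] * px[j] / denom) / lambda;

        s.Residual = e;
        // EW variance of residuals; the exact weighting convention here is an assumption
        s.EwVariance = lambda * s.EwVariance + (1.0 - lambda) * e * e;
    }
}
```

Each step is O(K^2) in the number of regressors and never touches the history, which is also why hand-rolled small-matrix code for K <= 4 (next comment) is enough.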

@buybackoff

Code for up to 4x4 is there in C. Above 4 is not practical in reality, because we would use the rolling regression only when we are "quite confident" in the model and already have some meaningful interpretation of the betas, which we then re-estimate in online mode. I have never seen a trustworthy linear regression with more than 2-3 real variables (not dummy ones).

(An interesting, but at the moment theoretical, question is dummy variables, for which we could monitor rolling t-stats just as a signal that some binary factor becomes significant - there could be many of them. However, this usually works on large data and in exploratory mode while refining the "true" meaningful model.)

https://github.com/niswegmann/small-matrix-inverse
http://stackoverflow.com/questions/1148309/inverting-a-4x4-matrix
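
For illustration only, closed-form cofactor inverses for the 2x2 and 3x3 cases (the 4x4 case follows the same pattern, see the small-matrix-inverse link above); the class and method names are made up for this sketch:

```csharp
// Illustration only: closed-form cofactor inverses for the tiny K x K matrices this
// issue cares about. No pivoting, so the caller is responsible for the matrix being
// well conditioned (which X'X with 2-3 sane regressors normally is).
public static class SmallInverse
{
    public static double[,] Inverse2x2(double[,] a)
    {
        double inv = 1.0 / (a[0, 0] * a[1, 1] - a[0, 1] * a[1, 0]);
        return new[,]
        {
            {  a[1, 1] * inv, -a[0, 1] * inv },
            { -a[1, 0] * inv,  a[0, 0] * inv }
        };
    }

    public static double[,] Inverse3x3(double[,] a)
    {
        // first-row cofactors: they form the first column of the adjugate and
        // give the determinant via expansion along row 0
        double c00 = a[1, 1] * a[2, 2] - a[1, 2] * a[2, 1];
        double c01 = a[1, 2] * a[2, 0] - a[1, 0] * a[2, 2];
        double c02 = a[1, 0] * a[2, 1] - a[1, 1] * a[2, 0];
        double inv = 1.0 / (a[0, 0] * c00 + a[0, 1] * c01 + a[0, 2] * c02);
        return new[,]
        {
            { c00 * inv, (a[0, 2] * a[2, 1] - a[0, 1] * a[2, 2]) * inv, (a[0, 1] * a[1, 2] - a[0, 2] * a[1, 1]) * inv },
            { c01 * inv, (a[0, 0] * a[2, 2] - a[0, 2] * a[2, 0]) * inv, (a[0, 2] * a[1, 0] - a[0, 0] * a[1, 2]) * inv },
            { c02 * inv, (a[0, 1] * a[2, 0] - a[0, 0] * a[2, 1]) * inv, (a[0, 0] * a[1, 1] - a[0, 1] * a[1, 0]) * inv }
        };
    }
}
```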

@buybackoff

  • It would be interesting to estimate the constant factor; at first glance the big O should be NNK vs K^3 just for xpx0.


buybackoff commented Jan 10, 2017

  • It is also possible to make a simple moving regression by applying SMA logic to the elements of the cov/var matrices: for a window of Xs and Ys of size M we know which values leave the window, so we can subtract their contribution in the same way we add the contribution of the new ones (see the sketch below). A simple moving regression will not carry large outliers forever and can be reproduced from any key, while exponential forgetting depends on the starting point.
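
A minimal sketch of that idea, assuming a fixed window of size M and the hypothetical `SmallInverse` helper sketched above (with K = 2 here); names and structure are illustrative, not the actual implementation:

```csharp
using System.Collections.Generic;

// Sketch of a fixed-window ("simple moving") regression: keep running X'X and X'y
// sums over the last M observations, add the contribution of the entering point
// and subtract the contribution of the leaving one.
public sealed class MovingRegression
{
    private readonly int _m;                          // window size M
    private readonly Queue<(double[] x, double y)> _window;
    private readonly double[,] _xpx;                  // running X'X over the window
    private readonly double[] _xpy;                   // running X'y over the window

    public MovingRegression(int k, int m)
    {
        _m = m;
        _window = new Queue<(double[] x, double y)>(m + 1);
        _xpx = new double[k, k];
        _xpy = new double[k];
    }

    // Returns the current betas, or null until the window is full.
    public double[]? Update(double[] x, double y)
    {
        int k = _xpy.Length;
        Accumulate(x, y, +1.0);
        _window.Enqueue(((double[])x.Clone(), y));
        if (_window.Count > _m)
        {
            var (xOld, yOld) = _window.Dequeue();
            Accumulate(xOld, yOld, -1.0);             // subtract the leaving observation
        }
        if (_window.Count < _m) return null;

        // betas = (X'X)^-1 X'y; a closed-form inverse is enough for K <= 4
        var inv = SmallInverse.Inverse2x2(_xpx);      // assumes K == 2 in this sketch
        var beta = new double[k];
        for (int i = 0; i < k; i++)
            for (int j = 0; j < k; j++)
                beta[i] += inv[i, j] * _xpy[j];
        return beta;
    }

    private void Accumulate(double[] x, double y, double sign)
    {
        for (int i = 0; i < _xpy.Length; i++)
        {
            _xpy[i] += sign * x[i] * y;
            for (int j = 0; j < _xpy.Length; j++) _xpx[i, j] += sign * x[i] * x[j];
        }
    }
}
```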

buybackoff added a commit that referenced this issue Jan 27, 2017
…erformance test vs non-online algo. Also need weights, probably some multiplier to reduce weights in the similar way to exponential forgetting. But won't do the original exponential version because it depends on the starting point.

buybackoff commented Jan 27, 2017

  • The simple moving regression is done as a proof of concept. It is not profiled for allocations, performance, etc.; the betas just look correct but have not been compared to the exact non-online output. It is commented out temporarily!

@buybackoff

The state is not cached; we recalculate the lagged xpxi on every move since it is done via zipn, but that is straightforward to fix via a manual implementation of BindCursor. The math is the same.

@buybackoff

But we do just twice the work of the window size M, not of N, so this is already online as well.

@buybackoff

Review dotnet/corefx#31779: small matrices are the most common case.

@buybackoff

https://en.wikipedia.org/wiki/Sherman%E2%80%93Morrison_formula
https://en.wikipedia.org/wiki/Recursive_least_squares_filter

Compared to most of its competitors, the RLS exhibits extremely fast convergence. However, this benefit comes at the cost of high computational complexity.

https://en.wikipedia.org/wiki/Woodbury_matrix_identity

This is applied, e.g., in the Kalman filter and recursive least squares methods, to replace the parametric solution, requiring inversion of a state vector sized matrix, with a condition equations based solution. In case of the Kalman filter this matrix has the dimensions of the vector of observations, i.e., as small as 1 in case only one new observation is processed at a time. This significantly speeds up the often real time calculations of the filter.

In the original implementation here only (X'X) was calculated in an online fashion, but the most expensive part is inverting it. cblas_?syrk + cblas_?symm + Sherman-Morrison should be O(N)-efficient, but may suffer from native call overhead for small N.
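
A minimal sketch of the Sherman-Morrison update referenced above, in the same illustrative C# as the earlier snippets: once (X'X)^-1 itself is part of the state, adding or dropping a single observation never needs a full re-inversion.

```csharp
// Sketch of the Sherman-Morrison rank-1 update: given aInv = (X'X)^-1, fold one
// observation x in (sign = +1) or out (sign = -1) without re-inverting.
// (A + sign * x x')^-1 = A^-1 - sign * (A^-1 x)(A^-1 x)' / (1 + sign * x' A^-1 x)
public static class ShermanMorrison
{
    public static void RankOneUpdate(double[,] aInv, double[] x, double sign)
    {
        int k = x.Length;
        var v = new double[k];      // v = A^-1 x; since A^-1 is symmetric, x' A^-1 = v'
        double xAx = 0.0;
        for (int i = 0; i < k; i++)
        {
            double acc = 0.0;
            for (int j = 0; j < k; j++) acc += aInv[i, j] * x[j];
            v[i] = acc;
            xAx += x[i] * acc;
        }
        double denom = 1.0 + sign * xAx;
        for (int i = 0; i < k; i++)
            for (int j = 0; j < k; j++)
                aInv[i, j] -= sign * v[i] * v[j] / denom;
    }
}
```

For exponential forgetting the same update applies after scaling, since (lambda * A)^-1 = A^-1 / lambda, then the new x x' is added via the rank-1 formula; with sign = -1 it removes the observation leaving a fixed window, so the simple moving regression above can also avoid a per-step inverse.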
