ChadFulton edited this page Jul 26, 2013 · 7 revisions

Implementation Notes

Estimation: Hansen vs Tsay vs (???)

There are several approaches to estimating SETAR models. They seem to differ in their ability to estimate the hyperparameters and in their ability to handle high-order SETAR models.

Tsay's approach, which I haven't pursued, emphasizes graphical analyses to identify e.g. thresholds, whereas Hansen's approach allows them to be estimated.

On the other hand, Hansen only covers 2nd- and 3rd-order models in detail. 4th-order models would work even in the current implementation, but we run into a curse-of-dimensionality problem: the grid space grows as the order increases, because each increase in order adds another threshold parameter to the grid.

Practically, it doesn't seem to me that there is much demand for SETAR models of 4th order or higher, but perhaps that's not a good reason for waving them away.

Hansen's Approach

7/25/13: I've used Hansen's approach and was able to figure out how he generalizes the SETAR(2) objective function to any SETAR(j) model. This implementation supports any SETAR(j) model and also allows the hyperparameters to be estimated.
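As a rough illustration of the kind of objective involved (a minimal sketch, not the code in this repository): for a SETAR(2) model, the sum of squared residuals can be computed by fitting OLS separately in each regime and then minimized over a grid of delays and thresholds. The function names, the 15% trimming fraction, and the minimum-observations rule below are all assumptions made for the example.

```python
import numpy as np


def setar2_ssr(y, ar_order, delay, threshold):
    """SSR of a two-regime SETAR, fitting OLS separately in each regime."""
    y = np.asarray(y, dtype=float)
    start = max(ar_order, delay)
    target = y[start:]
    # Regressors: a constant plus lags 1..ar_order of the series
    lags = np.column_stack(
        [y[start - k:len(y) - k] for k in range(1, ar_order + 1)]
    )
    X = np.column_stack([np.ones(len(target)), lags])
    # Threshold variable: the series itself, lagged by `delay`
    z = y[start - delay:len(y) - delay]
    lower = z <= threshold
    ssr = 0.0
    for mask in (lower, ~lower):
        # Assumption: reject splits that leave a regime under-identified
        if mask.sum() <= X.shape[1]:
            return np.inf
        beta, *_ = np.linalg.lstsq(X[mask], target[mask], rcond=None)
        resid = target[mask] - X[mask] @ beta
        ssr += resid @ resid
    return ssr


def grid_search_setar2(y, ar_order, max_delay, grid_size=100, trim=0.15):
    """Return (ssr, delay, threshold) minimizing the SSR over the grid."""
    y = np.asarray(y, dtype=float)
    # Candidate thresholds: quantiles of the series, with trimmed tails
    candidates = np.quantile(y, np.linspace(trim, 1 - trim, grid_size))
    best = (np.inf, None, None)
    for delay in range(1, max_delay + 1):
        for threshold in candidates:
            ssr = setar2_ssr(y, ar_order, delay, threshold)
            if ssr < best[0]:
                best = (ssr, delay, threshold)
    return best
```

Each (delay, threshold) pair costs two small OLS fits, so the total work scales with the number of grid points visited.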

Unit testing

Unfortunately the tsDyn.setar model has not yet reproduced the results I'm getting. It may be that I will need to test against another package somewhere (perhaps Hansen's code?).

Econometrics TODO

  • Implement grid search for delay and thresholds hyperparameters
  • Add test statistic calculation, and bootstrapping framework
  • Add model selection, SETAR(j), test (just an application of bootstrapping to compute the p-value of the test statistic)
  • Apply White's correction to the standard errors
  • Add prediction/forecasting
  • Consider adding the heteroskedasticity case; this requires care about the assumptions on the functional form of the conditional variance
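For the White's correction item, the standard HC0 "sandwich" estimator can be written in a few lines of NumPy. This is only a sketch of the textbook formula for a single OLS fit; how it would be applied per regime in the SETAR class is not settled here, and the function name is hypothetical.

```python
import numpy as np


def ols_white_se(X, y):
    """OLS coefficients with White (HC0) heteroskedasticity-robust SEs."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    xtx_inv = np.linalg.inv(X.T @ X)
    # "Meat" of the sandwich: X' diag(resid^2) X
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = xtx_inv @ meat @ xtx_inv
    return beta, np.sqrt(np.diag(cov))
```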

Statsmodels TODO

  • See TODOs in file
  • Add results class
  • General cleanup (pep8, docstrings, etc.)
  • What is appropriate parent class?
  • How best to do the estimation? Right now I call a new OLS class, but maybe better to use the parent class' fit function instead?
  • Since grid search and bootstrapping will require estimating the OLS many times from within the class, this may be a reason why using the parent class' fit function wouldn't necessarily be the right answer here?
  • How to use dates, etc (or maybe it's only used for forecasting?)
  • etc.

Implementation TODO

  • Is Hansen's implementation fast enough? For how high a SETAR order? How many obs? Is my implementation of Hansen fast enough?

7/25/13: The number of iterations in the hyperparameter selection algorithm is on the order of (max_delay * grid_threshold_size + (order - 2)^2 * grid_threshold_size). In practice, on my MacBook Air, one iteration takes ~0.25 ms, or about 277 ms in total, for a 2-regime AR(11) model with a grid size of 100. For a 3-regime AR(11) model with a grid size of 100, one iteration takes ~0.40 ms, or about 474 ms in total (per-iteration costs are higher as the SETAR order increases because of the iterative threshold calculations).
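The iteration count above can be checked against the reported timings with a little arithmetic. The sketch below assumes max_delay = 11 (equal to the AR order), which is not stated explicitly but is consistent with the measured totals. The function name is made up for the example.

```python
def n_iterations(max_delay, grid_threshold_size, order):
    """Iteration count from the formula quoted in the note above."""
    return (max_delay * grid_threshold_size
            + (order - 2) ** 2 * grid_threshold_size)


# Per-iteration costs (ms) are the figures reported in the note
for order, ms_per_iter in [(2, 0.25), (3, 0.40)]:
    n = n_iterations(max_delay=11, grid_threshold_size=100, order=order)
    print(f"order={order}: {n} iterations, ~{n * ms_per_iter:.0f} ms")
```

Under that assumption the formula gives 1100 and 1200 iterations, or roughly 275 ms and 480 ms, close to the measured 277 ms and 474 ms.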

  • Results Class: Most likely the results will be a combination of RegressionResults and ARResults. ARResults has several methods specific to the implementation in the AR model. One possibility would be to pull this out into another intermediate results class. The AR model is maximum likelihood based, while SETAR and TAR will be least squares based. However, that mainly means that there is less in common in the Model class; large parts will still be the same in the Results class.

Directed search:

7/25/13: I spent a while trying directed search, but with my limited knowledge of directed searches I wasn't able to make it even feasible, much less speedy. The problem is twofold:

  1. Large flat spaces: the objective function has large flat spaces across the delay and threshold parameters. Since the delay is discrete, the objective function is flat with respect to the delay except at the discrete changes. Since the threshold parameter only matters insofar as it cuts between observations, the objective function is flat with respect to the threshold except where it crosses an observation's value.

  2. Erratic: the objective function is essentially the SSR from the various estimated models across the possible delays and thresholds. Observationally, this has many local minima and maxima, and the global max or min may only be marginally higher or lower than another very different specification (see e.g. the differences between TestSunspotsSETAR2 and TestSunspotsSETAR2Search in test_setar.py, which are the same except for the size of the threshold grid).
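The flatness in the threshold direction is easy to see in a toy example: two thresholds that fall between the same pair of adjacent observations split the sample identically, so any SSR computed from that split is identical as well. The data below are simulated purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical values of the threshold variable, sorted for clarity
z = np.sort(rng.normal(size=20))

# Two distinct thresholds strictly between the 10th and 11th observations
lo, hi = z[9], z[10]
t1 = lo + 0.25 * (hi - lo)
t2 = lo + 0.75 * (hi - lo)

# Regime membership is the same for both thresholds
split1 = z <= t1
split2 = z <= t2
same_split = np.array_equal(split1, split2)
print(same_split, split1.sum())
```

Because the objective is a step function of the threshold, its derivative is zero wherever it exists, which gives a gradient-based search nothing to follow. It also means the candidate thresholds can be restricted to the observed values of the threshold variable.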

Extensions TODO

Generalize to TAR

I don't think this should be too hard, since (I think) all it requires is that the threshold variable be specified. So SETAR could subclass TAR, and just give it the lag according to the delay.
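One way this subclassing could look is sketched below. The class layout, constructor arguments, and NaN padding are all hypothetical, not an existing statsmodels API. TAR takes an arbitrary threshold variable, and SETAR builds that variable from the series lagged by the delay.

```python
import numpy as np


class TAR:
    """Hypothetical sketch: threshold AR with a user-supplied threshold variable."""

    def __init__(self, endog, ar_order, threshold_var):
        self.endog = np.asarray(endog, dtype=float)
        self.ar_order = ar_order
        self.threshold_var = np.asarray(threshold_var, dtype=float)
        if self.threshold_var.shape != self.endog.shape:
            raise ValueError("threshold_var must have the same shape as endog")


class SETAR(TAR):
    """Hypothetical sketch: TAR whose threshold variable is endog lagged by `delay`."""

    def __init__(self, endog, ar_order, delay):
        if delay < 1:
            raise ValueError("delay must be a positive integer")
        endog = np.asarray(endog, dtype=float)
        # Pad the first `delay` entries with NaN, where no lagged value exists
        threshold_var = np.full(endog.shape, np.nan)
        threshold_var[delay:] = endog[:-delay]
        super().__init__(endog, ar_order, threshold_var)
        self.delay = delay
```

With this layout, any estimation code written against `threshold_var` in TAR would apply to SETAR unchanged.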

Integrate with STAR

It's possible that we will want to do STAR -> TAR -> SETAR as the class hierarchy, since I believe that all are in the end estimated with conditional least squares, and they just depend on the transition function and the nature of the threshold variable. This will become more clear as I work on the STAR model.

Find nice ways to integrate nonlinear time series into the linear time series framework

Most introductions to nonlinear time series emphasize that nonlinear models are sufficiently more complicated (and more hassle) that if you can get away with a linear specification, you should. With that in mind, it seems like it would be nice to have some kind of smooth integration between AR, SETAR, TAR, and STAR, especially since they are technically nested.

Normally I'd think of subclassing (e.g. STAR -> TAR -> SETAR -> AR), but the theory for AR is so much further advanced that I don't think it would ever make practical sense to actually do that.

Possibly some kind of wrapper class that centralizes all the nonlinear testing functionality along with the different modeling approaches?

Integrate with supLR

The test statistic is actually a supLR statistic. It would be good to make the supLR test general enough to cover this case.

Objective Function (Sunspot Model)

3D Plot of Objective Function

Individual Plots of Objective Function
