
SMEP-D: Multiple Inheritance and Modifier Mixins for LikelihoodModels

The Proposal

Author: Josef Perktold (attribution kept because the text is still in first person)

(originally posted to mailing list https://groups.google.com/d/msg/pystatsmodels/NPY5ZydG6cI/l4tCPlr7-5QJ )

We make extensive use of super calls, but so far we have avoided multiple inheritance, so that we don't get the additional code complexity of figuring out which class is used at which point in the super call sequence.

We use some simple Mixins and multiple inheritance in the test suite.

I attended Raymond Hettinger's talk on super at PyCon, and I think we can safely extend our pattern to at least one level of multiple inheritance.

class PoissonMixed2(MixedMixin, Poisson):
    pass

>>> for i in PoissonMixed2.mro(): print(i)
... 
<class '__main__.PoissonMixed2'>
<class 'statsmodels.base.mixed.MixedMixin'>
<class 'statsmodels.discrete.discrete_model.Poisson'>
<class 'statsmodels.discrete.discrete_model.CountModel'>
<class 'statsmodels.discrete.discrete_model.DiscreteModel'>
<class 'statsmodels.base.model.LikelihoodModel'>
<class 'statsmodels.base.model.Model'>
<class 'object'>
                            ...
                             |
                      LikelihoodModel
                             |
                            ...
                             |
MixedMixin      -->       Poisson
    |
PoissonMixed2

That's similar to an example that Hettinger showed, where MixedMixin modifies a method of the other branch, the chain from Poisson up to LikelihoodModel.

The method resolution order is still simple and easy to figure out. This is full multiple inheritance, not just the addition of Mixins that define non-overlapping sets of methods, which is what we had discussed before.
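
As a minimal, self-contained illustration of the cooperative pattern (toy classes, not statsmodels code): the mixin listed first among the bases modifies the result of the next class in the MRO through super.

class Base(object):
    def value(self):
        return 10

class DoublingMixin(object):
    # cooperative override: delegate to the next class in the MRO,
    # then modify its result
    def value(self):
        return 2 * super(DoublingMixin, self).value()

class Combined(DoublingMixin, Base):
    pass

>>> Combined().value()
20
>>> [c.__name__ for c in Combined.mro()]
['Combined', 'DoublingMixin', 'Base', 'object']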

I used the same pattern for a PenalizedMixin:

class PoissonPenalized2(PenalizedMixin, Poisson):
    pass

In both cases the Mixin modifies the log-likelihood of the base likelihood model.

The penalized mixin is simpler: it just adds the penalization term to loglike, score and hessian, and will also need to do the same for the extra methods, score_obs, loglikeobs and so on.

class PenalizedMixin(object):
    def loglike(self, params):
        # log-likelihood of the next class in the MRO (e.g. Poisson)
        llf = super(PenalizedMixin, self).loglike(params)
        # subtract the weighted penalty term
        return llf - self.pen_weight * self.penal.func(params)
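
For illustration, a hedged usage sketch under the assumptions above; the L2Penalty class and the attribute names are made up here, and a full mixin would override score and hessian analogously (which is why method='nm', which only uses loglike, is chosen):

import numpy as np
from statsmodels.discrete.discrete_model import Poisson

class L2Penalty(object):
    # made-up penalty object with the func(params) interface assumed above
    def func(self, params):
        return np.sum(params**2)

class PoissonPenalized2(PenalizedMixin, Poisson):
    pass

y = np.random.poisson(2, size=500)
x = np.column_stack([np.ones(500), np.random.randn(500)])
mod = PoissonPenalized2(y, x)
mod.pen_weight = 10.      # attributes read by the loglike above
mod.penal = L2Penalty()
# method='nm' only needs loglike, the one method this sketch penalizes
res = mod.fit(method='nm')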

The MixedMixin is more complicated. It takes the super loglikeobs, e.g. of Poisson, integrates it over the random effects, and aggregates for each group in a panel, longitudinal or cluster setting.
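
A rough sketch of that integration step, not the actual implementation: the parameter layout, the _group_rows attribute and the _loglikeobs_re helper (the super loglikeobs evaluated with the random effect added, e.g. as an offset) are all assumptions here. It marginalizes a scalar random intercept per group with Gauss-Hermite quadrature:

import numpy as np
from scipy.special import logsumexp

class MixedMixin(object):
    # sketch: params = fixed effects followed by the log of the
    # random-effect standard deviation (layout assumed)
    def loglike(self, params):
        beta, sd = params[:-1], np.exp(params[-1])
        # probabilists' Gauss-Hermite nodes/weights for E[.] under N(0, 1)
        nodes, weights = np.polynomial.hermite_e.hermegauss(15)
        logw = np.log(weights / weights.sum())
        llf = 0.
        for rows in self._group_rows:    # assumed: row indices per group
            # conditional loglikeobs at each quadrature node u = sd * z
            ll_nodes = np.array(
                [self._loglikeobs_re(beta, sd * z)[rows].sum() for z in nodes])
            # log of the weighted average over the random effect
            llf += logsumexp(logw + ll_nodes)
        return llf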

This could be nested, although I haven't tried it yet:

class PoissonMixedPenalized(PenalizedMixin, MixedMixin, Poisson):
    pass

and we have penalized maximum likelihood for a Poisson model with cluster-specific random effects.

  • Poisson provides the underlying distribution,
  • MixedMixin integrates and aggregates, and
  • PenalizedMixin adds a penalty term.
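
Assuming the nested class above, the method resolution order (written out by hand, not verified output) chains both modifiers in front of the Poisson hierarchy, so PenalizedMixin.loglike penalizes the marginal log-likelihood produced by MixedMixin:

>>> [c.__name__ for c in PoissonMixedPenalized.mro()]
['PoissonMixedPenalized', 'PenalizedMixin', 'MixedMixin', 'Poisson',
 'CountModel', 'DiscreteModel', 'LikelihoodModel', 'Model', 'object']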

PoissonMixedPenalized has the submodels, including Poisson, as special cases (zero penalty weight, zero random-effects variance), and would be the only one that we really need, except that the signatures get complex.

And, in two more lines, we can do the same for GLM:

class GLMMixedPenalized(PenalizedMixin, MixedMixin, GLM):
    pass

(the base classes are listed in reverse order of the suffixes in the name, following the postfix naming convention)

The only thing I didn't like so much about the pattern is that we need, and get, a large number of new classes instead of adding new methods and extensions to existing classes.

Discussion

> <Kerby> Will the penalized mixin override fit, or provide a separate fit_regularized method?

Neither, either or both.

We can always add extra methods that are not in the inheritance chain, either for internal use or user-facing, like fit_regularized or _fit_regularized.

However, the user will always have the fit method available, whether inherited or modified, so it needs to work and will have to be part of the user-facing API.

Right now my PenalizedMixin does neither; I just use the inherited fit method, which calls the standard optimizer and creates the standard model-specific results class. (To clean up my PenalizedMixin, I will have to override fit at least for GLM, because it will not work with method='irls', which is still our default.)

We need to override and modify the inherited fit method (i) if we don't use the default optimizers, (ii) if we want to adjust the created results instance, or (iii) if the inherited method doesn't work without changes.

For example, if you only want to provide a regularized fit with a special optimizer, then you could just name your fit_regularized method fit, so that it replaces the inherited fit, or add a switch between optimizers (similar to GLM.fit).

As an example of an inherited `fit` that doesn't work: MixedMixin adds additional parameters to params, so some inherited methods won't work unchanged. I need to override the inherited predict to strip the extra parameters for the super().predict call, and I have to override fit because the default start_params don't include the extra parameters.
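
Continuing the MixedMixin sketch from above, those two overrides could look roughly like this (the parameter layout and the single extra parameter are assumptions):

import numpy as np

class MixedMixin(object):
    # sketch: fixed-effects coefficients come first in params, the extra
    # random-effects parameters are appended at the end (layout assumed)
    def predict(self, params, exog=None, *args, **kwargs):
        k = self.exog.shape[1]
        # strip the extra parameters before delegating to e.g. Poisson
        return super(MixedMixin, self).predict(params[:k], exog,
                                               *args, **kwargs)

    def fit(self, start_params=None, **kwargs):
        if start_params is None:
            # extend default start_params with a zero for the (assumed
            # single) extra random-effects parameter
            start_params = np.append(np.zeros(self.exog.shape[1]), 0.)
        return super(MixedMixin, self).fit(start_params=start_params,
                                           **kwargs)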

I think that we should provide several user-facing fit_xxx methods only if they provide clearly distinct functionality; currently fit is unregularized, while fit_regularized does the penalized fit (and in the discrete models also returns a different results class). Another possible reason to provide a second official fit function is if the signature is very different.

My guess for the specific case with elastic net:

If we provide special XXXPenalized classes, then it would be better to override fit and delegate to an internal _fit_elasticnet or _fit_regularized method. My main worry is about which attributes we have to attach to the model, and whether they could get out of sync if we have several fit methods. It might be easier to have one main fit method that is in "control" overall.
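
A minimal sketch of that delegation, with hypothetical method and keyword names:

class PenalizedMixin(object):
    def fit(self, method='bfgs', **kwargs):
        # one fit method stays in control: common input checking and
        # attribute handling live here, specialized solvers stay internal
        if method == 'elastic_net':
            return self._fit_elasticnet(**kwargs)  # hypothetical internal solver
        return super(PenalizedMixin, self).fit(method=method, **kwargs)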

But I don't have a very strong opinion about this yet. I have to go through the standard fit channel because I'm using the standard inherited optimizers. For elastic net, both ways can be made to work.

I'm getting more convinced of this (override fit and delegate to _fit_xxx).

We will have to attach the penalization weight and the penalty function to the model, since loglike, score and similar methods are evaluated at them by default. Otherwise, we would have to adjust several of the default methods and cached attributes in the Results. If we allow two different user-facing methods to change this state, then it might become easy to create conflicting behavior. (If penalization parameters and penalization weight are mutable, i.e. not fixed in __init__, we still run into the possibility of "stale state", which we haven't removed from all models yet.)
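
One way to reduce that risk, sketched with hypothetical keyword names, is to fix both in __init__ so every fit path sees the same state:

class PenalizedMixin(object):
    def __init__(self, *args, **kwargs):
        # fix the penalty and its weight at construction, so that
        # loglike/score/hessian and cached Results attributes all see
        # one consistent state
        self.penal = kwargs.pop('penal', None)           # hypothetical keyword
        self.pen_weight = kwargs.pop('pen_weight', 0.)   # hypothetical keyword
        super(PenalizedMixin, self).__init__(*args, **kwargs)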

On the positive side: fit can provide common things like input checking and results creation or modification that don't need to be repeated in every _fit_xxx method.