Josef Perktold edited this page Oct 5, 2017 · 5 revisions

Topic Weeks

Some ideas for topic weeks, currently incomplete and unsorted (and biased towards josef-pkt's projects).

These are mainly topics that fit into a week or where a week can provide significant progress

MNLogit and generic multivariate 2d endog support

Several things fail in postestimation for MNLogit and VAR, or need model-specific code to work around the missing generic support, mainly because of 2-dimensional params.
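To make the 2-dimensional params problem concrete, here is a minimal sketch in plain Python. It assumes a params table with one row per explanatory variable and one column per equation, flattened column by column. The helper names are hypothetical and are not statsmodels API; they only show the bookkeeping that generic postestimation code would need when it expects a 1-d parameter vector.

```python
# Hypothetical helpers, not statsmodels API. They illustrate mapping between a
# 2-d params table and the 1-d vector that generic code usually assumes.

def flatten_params(params_2d):
    """Flatten a (k_exog, n_equations) nested list column by column."""
    k_exog = len(params_2d)
    n_cols = len(params_2d[0])
    return [params_2d[i][j] for j in range(n_cols) for i in range(k_exog)]


def unflatten_params(params_1d, k_exog):
    """Rebuild the (k_exog, n_equations) table from the column-major vector."""
    n_cols, remainder = divmod(len(params_1d), k_exog)
    if remainder:
        raise ValueError("length of params_1d is not a multiple of k_exog")
    return [[params_1d[j * k_exog + i] for j in range(n_cols)]
            for i in range(k_exog)]


# 3 explanatory variables, 2 equations (made-up numbers)
params_2d = [[0.5, -1.0],
             [0.1, 0.2],
             [2.0, 0.0]]
flat = flatten_params(params_2d)
```

Anything indexed by parameter (standard errors, the covariance matrix, constraint matrices for hypothesis tests) has to follow the same ordering, which is where generic code without 2-d support tends to break.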

GSOC merge: remaining count data models

The last part of the count data GSOC 2017 project is still open and still needs work.
followup: similar to the followup to GP/NBP and zero-inflated, if needed.
followup: support margins (see separate topic)

GSOC merge: survey methods

The GSOC 2017 work is not yet merged; it needs review and possibly more unit tests.
followup: integration with cov_types

Probability plot - gofplots

Postponed for years. This will take me most of a week; it takes time just to think my way through all the option combinations. We need regression tests based on theoretical expectations and visual inspection. Other packages offer little that is comparable in the number of options, and what there is can hardly be recovered from the graphs: R and Stata don't allow easy access to the numbers that are in a plot.
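As an illustration of a test based on theoretical expectations, the following sketch computes the numbers behind a normal probability plot using only the standard library. The function name and the default plotting-position constant `a=0.5` are assumptions made for this example, not the gofplots API or its defaults.

```python
from statistics import NormalDist


def probplot_points(data, a=0.5):
    """Return (theoretical_quantiles, sample_quantiles) for a normal plot.

    Plotting positions are (i - a) / (n + 1 - 2 * a) for i = 1..n.
    The value of `a` is one of the options; 0.5 is only an assumption here.
    """
    n = len(data)
    if n == 0:
        raise ValueError("data must not be empty")
    sample = sorted(data)
    std_normal = NormalDist()
    probs = [(i - a) / (n + 1 - 2 * a) for i in range(1, n + 1)]
    theoretical = [std_normal.inv_cdf(p) for p in probs]
    return theoretical, sample


theo, samp = probplot_points([2.1, -0.3, 0.8, 1.5, -1.2])

# Theoretical expectation: data that already equal the theoretical quantiles
# must fall exactly on the 45 degree line.
theo_again, samp_again = probplot_points(theo)
```

Because the function returns the plotted numbers directly, a regression test can compare them without going through the figure.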

cov_type refactoring

Needs better structure so that more optional keywords describing the model can be supported, e.g. #2827, weights, ...

stats: rates and proportion

Several functions are in PRs or draft code and should be relatively easy to merge. Second target: figure out what is still missing at the end.

margins

Needs partial refactoring and needs to be added to other models. Some of it should be reasonably straightforward, such as adding the missing pieces to GLM and GEE. It is less clear what needs to be done to refactor and enhance the generic parts.

predict, get_prediction

get_prediction needs an API and design review and needs to be made public. predict_nl is also mainly a matter of adding/merging the code. The generic enhancement, similar to margins, is less clear. (Stata also has predict as part of margins.)

Weights for more models

e.g. discrete models and robust RLM

Treatment effect, IPW

Many pieces exist as prototype code. They need work to be put into proper form as functions and classes. Full support for reusing existing models needs prob_weights in those models.
second part, another week: causaleffect
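For orientation, a minimal sketch of the basic inverse probability weighting estimate of an average treatment effect. It assumes the propensity scores are already known (in practice they would come from a fitted model) and uses a hypothetical function name; it is not the prototype code referred to above.

```python
def ipw_ate(outcome, treatment, propensity):
    """IPW estimate of the average treatment effect (Horvitz-Thompson form).

    outcome: observed responses
    treatment: 0/1 treatment indicators
    propensity: P(treatment = 1 | covariates), assumed known here
    """
    n = len(outcome)
    if not (n == len(treatment) == len(propensity)) or n == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(p <= 0 or p >= 1 for p in propensity):
        raise ValueError("propensity scores must lie strictly in (0, 1)")
    # Each observation is weighted by the inverse probability of the
    # treatment status it actually received.
    mean_treated = sum(t * y / p for y, t, p
                       in zip(outcome, treatment, propensity)) / n
    mean_control = sum((1 - t) * y / (1 - p) for y, t, p
                       in zip(outcome, treatment, propensity)) / n
    return mean_treated - mean_control


# With a constant propensity of 0.5 and equal group sizes, the estimate
# reduces to the difference in group means.
ate = ipw_ate([3.0, 1.0, 4.0, 2.0], [1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])
```

The weights `1 / p` and `1 / (1 - p)` play the role of probability weights, which is why reusing existing outcome models for this depends on prob_weights support.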

Penalization, Mixin, GAM

Large topic with several partially separate pieces. It is difficult to get comparison numbers for unit tests in this area.

  • GAM and Mixin
  • Ridge and standalone functions
  • selection (?)
  • more penalization functions and structure
  • Firth and similar

Lots of pieces, but some difficult details that will take time. Figuring out unit tests will also take time. One possibility is cross-checking different implementations. Another is to "bail out" on some extra results, i.e. declare them as undefined or unverified.
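The idea of cross-checking different implementations can be sketched with a deliberately tiny case: ridge regression with a single regressor and no intercept, computed once from the closed form and once as ordinary least squares on augmented data. The function names, the penalty convention (sum of squared residuals plus `alpha * b**2`) and the data are assumptions for illustration only.

```python
import math


def ridge_closed_form(x, y, alpha):
    """Minimize sum((y - b*x)**2) + alpha * b**2 using the closed form."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + alpha)


def ridge_augmented_ols(x, y, alpha):
    """Same estimate via OLS on data augmented with one pseudo-observation.

    Appending x = sqrt(alpha), y = 0 adds alpha * b**2 to the sum of squares.
    """
    x_aug = list(x) + [math.sqrt(alpha)]
    y_aug = list(y) + [0.0]
    sxy = sum(xi * yi for xi, yi in zip(x_aug, y_aug))
    sxx = sum(xi * xi for xi in x_aug)
    return sxy / sxx


x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
b_closed = ridge_closed_form(x, y, alpha=1.0)
b_augmented = ridge_augmented_ols(x, y, alpha=1.0)
```

A unit test would then assert that the two results agree to numerical precision. This verifies internal consistency only; it does not replace comparison numbers from an independent package.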

multivariate basic stats

correlation, covariance, hypothesis tests, Hotelling's T, ... robust estimators
I (JP) don't have an overview, just bits and pieces in PRs or experimental scripts.
one target: robust, penalized/shrunk covariance or scatter matrices for reuse in multivariate models

Robust

Perennial topic: a large number of PRs, fixes and enhancements with very little unit testing. Some support in models needs weights: iweights, prob_weights or freq_weights.
main task: come up with unit tests, decide on some corner cases, and just merge it.

...
