Regression suggestions #12

Open
Deleetdk opened this issue May 22, 2020 · 3 comments

Comments

@Deleetdk

You cover LOESS, polynomials and breakpoint modelling for nonlinear patterns, but oddly do not cover splines in linear models (breakpoint modelling is a subset of regression splines limited to linear, slope = 0 models). Splines have many advantages. Check out the coverage in https://link.springer.com/book/10.1007/978-3-319-19425-7 (book is on libgen). This book is linked to the rms package, which implements a lot of this. See here for an example: https://rpubs.com/EmilOWK/rms_splines. You cover GAMs later on, but don't touch on how mixing splines with ordinary additive/linear terms in a model can yield interpretable but powerful models.
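To make the point about mixing spline and linear terms concrete, here is a minimal sketch in Python with NumPy (not the rms package mentioned above). The simulated data, the knot at x = 4 and the helper `linear_spline_basis` are all hypothetical choices made for illustration. It fits a degree-1 regression spline in `x` together with an ordinary linear term in `z`, which is the breakpoint case where the slope drops to 0 after the knot.

```python
import numpy as np

def linear_spline_basis(x, knots):
    """Truncated-power basis for a degree-1 regression spline:
    the raw x column plus one hinge term max(x - k, 0) per knot."""
    x = np.asarray(x, dtype=float)
    hinges = [np.maximum(x - k, 0.0) for k in knots]
    return np.column_stack([x] + hinges)

# Hypothetical simulated data: y rises with slope 1 in x until x = 4,
# is flat afterwards, and also depends linearly on a second covariate z.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, n)
z = rng.normal(size=n)
y = 2.0 + 1.0 * np.minimum(x, 4.0) + 0.5 * z + rng.normal(scale=0.1, size=n)

# Design matrix: intercept, spline terms for x, plain linear term for z.
X = np.column_stack([np.ones(n), linear_spline_basis(x, knots=[4.0]), z])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

intercept, slope_before, slope_change, z_effect = coef
slope_after = slope_before + slope_change  # slope of y in x past the knot

print(f"slope before knot: {slope_before:.2f}")
print(f"slope after knot:  {slope_after:.2f}")
print(f"effect of z:       {z_effect:.2f}")
```

Each coefficient has a direct reading: the slope before the knot, the change in slope at the knot, and the usual linear effect of `z`. In this sketch the knot location is assumed known; estimating it from the data is a separate problem.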

For understanding predictions from models, try the ggeffects package. Examples here: https://rpubs.com/EmilOWK/ggeffects_examples; official examples: https://strengejacke.github.io/ggeffects/

@Derek-Jones
Owner

Thanks for the suggestions.

Not covering splines or saying anything much about polynomial regression is intentional.

The focus of model building is understanding, not prediction (which is why machine learning does not get a look-in).

A polynomial model can be made to fit almost anything (I do give an example). But what can be learned from such a fit?
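As a small illustration of that claim (this is not the example from the book, just a sketch on made-up data), the following Python snippet fits a degree n − 1 polynomial to n points of pure noise. The seed, sample size and variable names are arbitrary assumptions.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical data: y is pure noise and has no relationship with x.
rng = np.random.default_rng(1)
n = 8
x = np.linspace(-1.0, 1.0, n)
y = rng.normal(size=n)

def r_squared(y_obs, y_fit):
    """Proportion of variance in y_obs accounted for by y_fit."""
    ss_res = np.sum((y_obs - y_fit) ** 2)
    ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# A straight line versus a polynomial with as many coefficients as points.
line_fit = P.polyval(x, P.polyfit(x, y, deg=1))
poly_fit = P.polyval(x, P.polyfit(x, y, deg=n - 1))

r2_line = r_squared(y, line_fit)
r2_poly = r_squared(y, poly_fit)

print(f"R^2, straight line:       {r2_line:.3f}")
print(f"R^2, degree {n - 1} polynomial: {r2_poly:.3f}")
```

The high-degree polynomial passes through every point, so its R² is essentially 1 even though there is nothing to explain in the data.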

Isn't teaching people spline or polynomial fitting giving them a license to fit nonsense models?

@Deleetdk
Author

Splines are easy to interpret and don't overfit much; that's what the second part of my comment is about.

@Derek-Jones
Owner

How do you give a real-world interpretation to splines?

Are there many processes that change their characteristics slightly, on a recurring basis? There might well be, in which case splines would be applicable.

I see most people using splines to improve the fit to their data, which can be a good thing when making predictions. But given that researchers are incentivized to improve model fits, I see the primary use of splines as rigging the quality of a fitted model.
