Handling of weights in simulate -> warning? #74

Open
florianhartig opened this issue Aug 24, 2021 · 2 comments

@florianhartig

Hi Matteo,

I am considering switching to mgcViz:::simulate for simulating from gam objects in DHARMa, see florianhartig/DHARMa#309.

One suggestion: when a model is fitted with weights for a family other than binomial or Gaussian, I assume the weights are simply applied to the likelihood during fitting but ignored in the simulations. I think it would be better to throw a warning in that case (currently, no warning is issued).

Cheers,
F

@mfasiolo
Owner

Hi Florian,

I was looking at this and the weights are not ignored, but passed to family$rd() or family$qf()...
For instance:

> gaulss()$rd
function (mu, wt, scale) 
{
    return(rnorm(nrow(mu), mu[, 1], sqrt(scale/wt)/mu[, 2]))
}

uses the weights.
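
To make the role of wt concrete, here is a minimal sketch. It uses a local copy of the function printed above rather than calling mgcv, and the name rd_gaulss is made up for illustration. Reading mu[, 2] as the inverse standard deviation is an assumption taken from how the printed code divides by it. Under that reading, a weight of 4 should halve the standard deviation of the simulated values.

```r
# Local copy of the function printed above (hypothetical name), so that the
# sketch runs without mgcv installed.
rd_gaulss <- function(mu, wt, scale) {
  rnorm(nrow(mu), mu[, 1], sqrt(scale / wt) / mu[, 2])
}

set.seed(1)
n  <- 1e5
# Column 1: mean. Column 2: assumed to be 1 / sd, as the division above implies.
mu <- cbind(rep(0, n), rep(1, n))

y_w1 <- rd_gaulss(mu, wt = rep(1, n), scale = 1)
y_w4 <- rd_gaulss(mu, wt = rep(4, n), scale = 1)

# Ratio of empirical standard deviations; sqrt(1 / 4) = 0.5 up to Monte Carlo error
sd_ratio <- sd(y_w4) / sd(y_w1)
sd_ratio
```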

Matteo

@florianhartig
Author

Yes, for gaussian / binomial, weights have a particular meaning in the likelihood / data-generating model, but for Poisson, the weights are just weights on the likelihood and have no correspondence to any data-generating model (effectively, this is a pseudo-likelihood). In this case, simulated data will not always look like observed data (because the weights cause the fit to disregard particular data points).
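
A small sketch of the Poisson case, using base R's glm() rather than a gam so that it is self-contained. The data and weights are invented for illustration. Half of the observations are almost switched off by their weights, so the fit follows the other half, and values drawn from the fitted Poisson means no longer resemble the full observed sample.

```r
set.seed(42)
# Two groups with very different means (made-up data)
y <- c(rpois(100, 2), rpois(100, 20))
# Weights that nearly remove the second group from the likelihood
w <- rep(c(1, 0.01), each = 100)

fit <- glm(y ~ 1, family = poisson, weights = w)
mu_hat <- unname(fitted(fit)[1])  # intercept-only fit: the weighted mean of y

# Draw from the fitted Poisson means; the weights play no role in this step
y_sim <- rpois(length(y), fitted(fit))

c(observed = mean(y), fitted = mu_hat, simulated = mean(y_sim))
```

The observed mean sits roughly midway between the two group means, while the simulated mean stays close to the first group, which is the mismatch described above.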

So, effectively, weights in regression packages in R are used in three different ways:

  1. Scale the expected dispersion in the likelihood (as in the Gaussian) -> can be simulated from, no problem
  2. Weight the likelihood only (e.g. Poisson) -> can't be simulated from, simulations won't match the data -> simulate() should throw a warning
  3. Give the binomial n -> no problem

In retrospect, I think it was a mistake by the R developers to overload the weights argument in glm with these different meanings. It would have been much better to have a separate argument for each of the three uses.

Anyway, what I would suggest is to throw a warning for all families that use weights on the likelihood only, without a corresponding data-generating model. This is certainly the case for the Poisson; I am not sure about the other extended families.
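
A rough sketch of what such a check could look like. Everything here is hypothetical: check_sim_weights() and weights_in_dgp are illustrative names, not part of mgcViz or DHARMa, and the list of families treated as safe is an assumption that would need reviewing family by family.

```r
# Assumption: only these families give prior weights a meaning in the
# data-generating model (variance scaling or number of trials).
weights_in_dgp <- c("gaussian", "binomial")

# Hypothetical helper: warn if non-unit prior weights are present in a family
# where they are assumed to act on the likelihood only.
check_sim_weights <- function(model) {
  w   <- weights(model, type = "prior")
  fam <- family(model)$family
  nonunit <- !is.null(w) && any(w != 1)
  if (nonunit && !(fam %in% weights_in_dgp)) {
    warning("Prior weights in family '", fam, "' act on the likelihood only; ",
            "simulations will not reflect them.", call. = FALSE)
    return(invisible(FALSE))
  }
  invisible(TRUE)
}

set.seed(3)
d <- data.frame(y = rpois(50, 3), w = runif(50, 0.5, 2))  # made-up data
fit_pois <- glm(y ~ 1, family = poisson,  data = d, weights = w)
fit_gaus <- glm(y ~ 1, family = gaussian, data = d, weights = w)

# Capture the warning text for the Poisson fit; the Gaussian fit passes silently
msg <- tryCatch(check_sim_weights(fit_pois), warning = function(cond) conditionMessage(cond))
ok  <- check_sim_weights(fit_gaus)
```

As far as I know, base R takes a similar approach for plain glm fits: the simulate method of the poisson() family warns "ignoring prior weights" when any prior weight differs from 1.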
