
Better priors for a log-link function with low frequency #643

Open

omrihar opened this issue Feb 14, 2023 · 6 comments

Comments
@omrihar commented Feb 14, 2023

I'm trying to use Bambi models for Poisson GLMs with relatively low frequencies (insurance claims data). The automatically generated priors tend to have rather large sigmas (they can reach the order of tens). Because the parameters get exponentiated by the log link, this implies enormous frequencies: with sigma = 50, a 1-sigma draw already maps to exp(50) ≈ 5e21 on the response scale. This invariably leads to ValueError: lam value too large, even in purely prior predictive checks.
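To make the failure mode concrete, here is a minimal NumPy sketch (not Bambi code) of what a wide prior does under a log link:

import numpy as np

rng = np.random.default_rng(0)
beta = rng.normal(0.0, 50.0, size=10_000)  # wide prior on the linear-predictor scale
lam = np.exp(beta)                         # the log link exponentiates the draws
print(lam.max())                           # astronomically large rates; Poisson sampling fails here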

Currently I deal with this by setting all of the priors by hand (roughly as sketched below), but it would be nice if the automatic prior selection took the link function into account. Either that, or the ability to specify a maximum sigma that would apply to all the priors.
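For reference, the manual workaround looks roughly like this; the formula, column names, and prior values are made up for illustration:

import bambi as bmb
import pandas as pd

# Hypothetical claims data; substitute your actual DataFrame.
data = pd.DataFrame(
    {"claims": [0, 1, 0, 2, 0, 1], "exposure": [0.9, 1.2, 0.7, 1.5, 1.1, 0.8]}
)

# Tight Normal priors, specified on the linear-predictor (log) scale.
priors = {
    "Intercept": bmb.Prior("Normal", mu=0, sigma=1),
    "exposure": bmb.Prior("Normal", mu=0, sigma=0.5),
}
model = bmb.Model("claims ~ exposure", data, family="poisson", priors=priors)
idata = model.prior_predictive()  # no longer overflows lam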

I scanned through the issues and couldn't find anything similar to this, so apologies if this is a duplicate.

Thanks!
Omri

@tomicapretto (Collaborator)

This is not a duplicate. And thanks for raising the issue!

I don't have a solution in mind right now. Let's keep this open until we improve it.

@omrihar (Author) commented Feb 15, 2023

I haven't dived into the details of the automatic prior selection yet, but perhaps simply applying the link function could be a good start? I assume the scale of sigma comes from the scale of the data somehow, so if we're using the log link, what about simply taking the log of the suggested sigma? That would already bring the enthusiastic automatic estimates down quite a lot, and it would keep the relationship with the data on the linear-predictor scale...
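A quick sketch of the effect (illustrative numbers only):

import numpy as np

rng = np.random.default_rng(0)
suggested_sigma = 50.0                  # the kind of sigma Bambi currently suggests
logged_sigma = np.log(suggested_sigma)  # ~3.9 on the linear-predictor scale
lam = np.exp(rng.normal(0.0, logged_sigma, size=10_000))
print(np.percentile(lam, 99))           # rates back in a workable range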

@tomicapretto (Collaborator)

You're right, the scale of sigma comes from the data: it's inversely proportional to the standard deviation of the predictor. My initial reaction is that applying the log-link function shouldn't do any harm, but I should test the idea more carefully.
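In symbols, the default scale for a slope term is roughly

sigma_slope = STD * sd(response) / sd(x)

which matches the get_slope_sigma snippet quoted later in this thread.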

@omrihar (Author) commented Feb 19, 2023

Of course it's just an idea, but that's what I usually try first when the prior predictive checks fail due to lam being too large: taking the log of whatever I thought the standard deviation should be.
Since I'm working with very small frequencies in a Poisson GLM, it's really tricky to get that right without looking at the right scale.

@omrihar (Author) commented Mar 23, 2023

Are there any plans to implement this in the near-ish future? If not, could you perhaps point me to where the priors are set in the code, so I can at least change that locally and, if the result is good enough, provide a PR?
This is currently stopping me from using Bambi (I use PyMC instead), because it's quite tedious to go through all the priors and set them by hand, and Bambi lets me build much more complex models much faster...

Thanks!

@tomicapretto (Collaborator) commented Mar 23, 2023

Hi @omrihar, I'm not working on this aspect of Bambi right now, but if you want to give it a try, here are some directions.

Everything related to prior scaling happens in bambi/priors/scaler.py.

Notice there are different scaling stages/functions for the different types of terms (e.g. intercept, common, and group-specific).

Notice PriorScaler has access to the model, and thus to the family, and thus to the link function. So you could check whether you're using a Poisson family (or any other family) with a "log" link function and adjust the standard deviation of the priors accordingly.

In particular,

this is where the scale of the prior for common terms is computed

def get_slope_sigma(self, x):
    return self.STD * (self.response_std / np.std(x))

and this is where you get both mu and sigma for the intercept

def get_intercept_stats(self):
    mu = self.response_mean
    sigma = self.STD * self.response_std

    # Only adjust mu and sigma if there is at least one Normal prior for a common term.
    if self.priors:
        sigmas = np.hstack([prior["sigma"] for prior in self.priors.values()])
        x_mean = np.hstack(
            [self.response_component.terms[term].data.mean(axis=0) for term in self.priors]
        )
        sigma = (sigma**2 + np.dot(sigmas**2, x_mean**2)) ** 0.5

    return mu, sigma
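As a rough, untested sketch of the kind of adjustment this suggests (the way the link is queried here, self.model.family.link.name, is an assumption; check it against the Bambi version you're using):

import numpy as np

def get_slope_sigma(self, x):
    sigma = self.STD * (self.response_std / np.std(x))
    # Hypothetical tweak: under a log link, shrink large scales so that
    # 1-sigma draws remain plausible after exponentiation.
    link_name = getattr(self.model.family.link, "name", None)  # assumed attribute
    if link_name == "log" and sigma > 1:
        sigma = np.log(sigma)
    return sigma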
