Outcome variable with highly right-skewed distribution with mass at zero #1375

Open
robert702 opened this issue Nov 20, 2023 · 1 comment

Hello,
I am trying to fit a causal forest on an outcome that has a highly right-skewed distribution with a mass at zero. I then want to sort the sample into deciles of predicted treatment effects and estimate the treatment effect within each decile, using the standard cross-fitting procedure.
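To make the ranking step concrete, here is a minimal sketch in plain Python. It assumes the predicted treatment effects are already out-of-fold, meaning each unit's prediction comes from a forest that was not trained on that unit. The function name `assign_quantile_groups` and the example values are hypothetical and are not part of grf.

```python
def assign_quantile_groups(tau_hat, n_groups=10):
    """Label each unit with a group in 0..n_groups-1 by rank of tau_hat.

    tau_hat is assumed to hold out-of-fold predicted treatment effects.
    Ties are broken by original position, because sorted() is stable.
    """
    n = len(tau_hat)
    if n < n_groups:
        raise ValueError("need at least as many units as groups")
    # Indices of the units, ordered from smallest to largest predicted effect.
    order = sorted(range(n), key=lambda i: tau_hat[i])
    groups = [0] * n
    for rank, i in enumerate(order):
        # Integer arithmetic maps ranks 0..n-1 onto groups 0..n_groups-1.
        groups[i] = rank * n_groups // n
    return groups


# Hypothetical predicted effects for four units, split into two groups.
print(assign_quantile_groups([0.3, -0.1, 0.8, 0.0], n_groups=2))
```

Group 0 holds the units with the smallest predicted effects and group `n_groups - 1` the largest. Whether to rank over the full sample or separately within each fold is a design choice this sketch leaves open.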

My intuition is that, given these characteristics of the outcome variable, it would be ideal to estimate treatment effects with a Poisson regression (a GLM with a log link, i.e. an exponential mean function). But I don't think the current grf algorithm allows for that; please correct me if I am wrong. More generally, it is not clear to me how to think about Poisson regression with high-dimensional covariate spaces.

As a second-best option, I am thinking of training a standard causal forest and then using a Poisson regression when estimating the treatment effect within each decile of predicted treatment effects. I cannot think of any conceptual issue with doing that, but please correct me if I am wrong.
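One thing a small sketch makes visible: in the simplest case, where the within-decile Poisson regression has a log link and includes only an intercept and the treatment indicator (an assumption made here purely for illustration), the fitted treatment coefficient equals the log of the ratio of arm means. The regression then reports a multiplicative effect, not the difference in means that a causal forest targets. The function name below is hypothetical.

```python
from math import log
from statistics import mean


def poisson_log_rate_ratio(y, w):
    """Treatment coefficient of a Poisson GLM (log link) of y on an
    intercept and a 0/1 treatment indicator w, with no other covariates.

    For this saturated model the maximum-likelihood fit reproduces the
    two arm means, so the coefficient is log(mean treated / mean control).
    """
    treated = [yi for yi, wi in zip(y, w) if wi == 1]
    control = [yi for yi, wi in zip(y, w) if wi == 0]
    if not treated or not control:
        raise ValueError("both treatment arms must be non-empty")
    m1, m0 = mean(treated), mean(control)
    if m1 <= 0 or m0 <= 0:
        # With a mass at zero, a small group can have an all-zero arm,
        # in which case the log ratio is undefined.
        raise ValueError("both arms need a positive mean outcome")
    return log(m1 / m0)


# Hypothetical zero-heavy outcomes: control mean 1, treated mean 2.
y = [0, 0, 2, 2, 0, 4, 0, 4]
w = [0, 0, 0, 0, 1, 1, 1, 1]
print(poisson_log_rate_ratio(y, w))  # log(2)
```

The explicit check for an all-zero arm matters for outcomes with a mass at zero, since a single decile may contain few or no positive outcomes in one arm.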

The only remaining challenge is this: if I had a normal outcome, I would calculate the treatment effect of each decile with the AIPW method. But since the outcome is so right-skewed, I wonder whether I can use a version of AIPW in which the outcome is predicted with a Poisson regression, instead of the standard AIPW procedure, which I believe uses a random forest by default. That is, I would use a Poisson-based AIPW estimator instead of the standard AIPW estimator of the average_treatment_effect function in grf. Would an approach like this make sense? Or is there another way to think about right-skewed outcomes in the context of causal forests?
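To make the proposal concrete, here is a minimal sketch of AIPW scores in plain Python. It is not grf's implementation. It assumes the outcome predictions `mu0` and `mu1` (control and treated) and the propensity scores `e` have already been estimated out of fold by whatever model is preferred, for example a Poisson regression for the outcome. All function names and numbers are hypothetical.

```python
from statistics import mean


def aipw_scores(y, w, mu0, mu1, e):
    """Doubly robust (AIPW) score for each unit.

    y: observed outcomes; w: 0/1 treatment indicators;
    mu0, mu1: predicted outcomes under control / treatment (any model);
    e: estimated propensity scores P(W = 1 | X), strictly inside (0, 1).
    """
    scores = []
    for yi, wi, m0, m1, ei in zip(y, w, mu0, mu1, e):
        if not 0.0 < ei < 1.0:
            raise ValueError("propensity scores must lie strictly in (0, 1)")
        # Residual against the prediction for the arm actually observed.
        m_obs = m1 if wi == 1 else m0
        # Equals (y - mu1) / e for treated units and
        # -(y - mu0) / (1 - e) for control units.
        correction = (wi - ei) / (ei * (1.0 - ei)) * (yi - m_obs)
        scores.append(m1 - m0 + correction)
    return scores


def group_ate(scores, groups):
    """Average the AIPW scores within each group label."""
    by_group = {}
    for s, g in zip(scores, groups):
        by_group.setdefault(g, []).append(s)
    return {g: mean(v) for g, v in sorted(by_group.items())}


# Hypothetical inputs: four units, two groups, known propensity 0.5.
y = [0, 3, 0, 12]
w = [0, 1, 0, 1]
mu0 = [0.5, 1.0, 0.5, 4.0]
mu1 = [1.0, 2.0, 1.0, 8.0]
e = [0.5, 0.5, 0.5, 0.5]
scores = aipw_scores(y, w, mu0, mu1, e)
print(scores)                           # [1.5, 3.0, 1.5, 12.0]
print(group_ate(scores, [0, 0, 1, 1]))  # {0: 2.25, 1: 6.75}
```

The score formula only consumes predictions, so swapping the outcome model changes `mu0` and `mu1` and nothing else. The estimand stays a difference in means on the original scale of the outcome.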

Thank you,

R

robert702 changed the title from "Highly right-skewed distribution with mass at zero" to "Outcome variable with highly right-skewed distribution with mass at zero" on Nov 20, 2023
@erikcs
Member

erikcs commented Nov 22, 2023

Hi @robert702, there’s nothing wrong with causal forests and skewed data per se. In some settings, though, there could be a general identification issue, for example if the outcome is extremely rare.
