Analysis of data with multiple treatment regions #320

Open
drbenvincent opened this issue Apr 23, 2024 · 0 comments · May be fixed by #338
Labels: documentation (Improvements or additions to documentation), feature request

Comments

drbenvincent (Collaborator) commented Apr 23, 2024

We already have the ability to run geo-testing (with synthetic control methods), but this assumes that we have just a single treated region.

But often we may have (or plan to have) multiple regions that receive treatment. The treatment may be the same across regions or different. If it is different, it could differ in kind (e.g. store refurbishment vs. price discounts) or in degree (e.g. the level of price discount).

Below are notes which sketch out some approaches we could take here:

1. Individual synthetic control analyses for each treated geography

It would be perfectly valid to treat each treated region as its own experiment. For each treated region, the set of control regions would be the complete set of remaining untreated regions. I don't see any problem in a control region being used as a donor in multiple synthetic control experiments.

This method may make most sense if the treatments were different in kind or magnitude. That is, when we do not expect the effects of the intervention to be similar in each treated region.

Optionally, if the data have a natural hierarchy, we can choose the control regions based on it. For example, if we have one test store per geographical state, then each test could use only the untreated stores from the same state as its controls.
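
As a rough illustration of approach 1, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the region names, the effect sizes, and the data-generating process. It also uses unconstrained least-squares weights as a simple stand-in for a proper synthetic control fit (which would typically constrain or regularise the weights), and it does not use the CausalPy API.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pre, n_post = 80, 20
n_time = n_pre + n_post
regions = ["r0", "r1", "r2", "r3", "r4", "r5"]
true_effects = {"r2": 2.0, "r3": 5.0}  # assumed effects, different in magnitude

# Simulated outcomes: a shared random-walk trend scaled per region, plus noise.
trend = np.cumsum(rng.normal(size=n_time))
data = {
    name: 10 + (0.5 + 0.2 * i) * trend + rng.normal(scale=0.1, size=n_time)
    for i, name in enumerate(regions)
}
for name, effect in true_effects.items():
    data[name][n_pre:] += effect  # the intervention shifts the post-period level

# Every treated region uses the same donor pool: all untreated regions.
donors = [name for name in regions if name not in true_effects]
X = np.column_stack([data[name] for name in donors])

estimates = {}
for name in true_effects:
    # Fit donor weights on the pre-intervention period only.
    weights, *_ = np.linalg.lstsq(X[:n_pre], data[name][:n_pre], rcond=None)
    counterfactual = X[n_pre:] @ weights
    # Average post-period gap between observed and counterfactual outcomes.
    estimates[name] = float(np.mean(data[name][n_pre:] - counterfactual))

print(estimates)
```

Each treated region gets its own estimate, and the same donor columns are reused in every fit, which matches the point above that a control region can act as a donor in more than one analysis.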

2. Aggregated synthetic control

If the treatments are similar and we expect the effects of the intervention to be similar across treated regions, then it may make sense to create a new aggregate test region (e.g. the mean or median of the treated regions) and analyse it as a single treated unit.
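
A minimal sketch of the aggregated approach, again on simulated data. The loadings, the common effect size, and the least-squares fit are all illustrative assumptions rather than anything CausalPy provides.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pre, n_post = 80, 20
n_time = n_pre + n_post
true_effect = 3.0  # assumed common effect in every treated region

trend = np.cumsum(rng.normal(size=n_time))

def simulate(loading):
    """One region: shared trend scaled by `loading`, plus independent noise."""
    return 10 + loading * trend + rng.normal(scale=0.1, size=n_time)

treated = np.column_stack([simulate(loading) for loading in (0.9, 1.0, 1.1)])
treated[n_pre:] += true_effect
controls = np.column_stack([simulate(loading) for loading in (0.5, 0.8, 1.2, 1.5)])

# Collapse the treated regions into one aggregate test region per time point.
# np.median(treated, axis=1) would be the median-based alternative.
aggregate = treated.mean(axis=1)

# Single fit: donor weights from the pre-period, then the post-period gap.
weights, *_ = np.linalg.lstsq(controls[:n_pre], aggregate[:n_pre], rcond=None)
effect_estimate = float(np.mean(aggregate[n_pre:] - controls[n_pre:] @ weights))

print(effect_estimate)
```

Averaging across treated regions also averages out region-level noise, which is one reason this option may be attractive when the effects are expected to be similar.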

TODO

This issue can be closed by the addition of a new notebook example which demonstrates each of these approaches. It should operate on simulated data where we simulate known intervention effects on multiple regions.
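
One possible shape for the simulated data in such a notebook is sketched below. The function name `simulate_regions`, its parameters, and the data-generating process (shared random-walk trend, per-region loadings, constant post-period lift) are hypothetical choices, not an existing helper.

```python
import numpy as np

def simulate_regions(effects, n_regions=8, n_pre=80, n_post=20, seed=0):
    """Simulate region-by-time outcomes with known post-period effects.

    `effects` maps a region index to the constant lift added after the
    intervention. Returns an array of shape (n_pre + n_post, n_regions).
    """
    rng = np.random.default_rng(seed)
    n_time = n_pre + n_post
    trend = np.cumsum(rng.normal(size=n_time))
    loadings = rng.uniform(0.5, 1.5, size=n_regions)
    noise = rng.normal(scale=0.1, size=(n_time, n_regions))
    outcomes = 10 + np.outer(trend, loadings) + noise
    for region, effect in effects.items():
        outcomes[n_pre:, region] += effect
    return outcomes

# Three treated regions: two with the same effect, one with a larger effect.
effects = {0: 2.0, 3: 2.0, 5: 4.0}
with_effect = simulate_regions(effects)
without_effect = simulate_regions({})  # same seed, so identical apart from the lift

# Ground truth check: the post-period difference is the simulated effect.
lift = (with_effect - without_effect)[80:].mean(axis=0)
print(lift)
```

Because the effects are known by construction, any of the approaches above can be scored against them.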

It could also be useful to create new functionality, for example a function or class called something like MultiCellSyntheticControl, to which we pass kwargs specifying the test regions and the method (independent vs aggregated).
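
To make the idea concrete, here is a hypothetical sketch of what such an interface might look like. Nothing here exists in the library: the field names, the `method` values, and the `plan` method are assumptions, and the sketch only works out which analyses would be run without fitting any model.

```python
from dataclasses import dataclass

@dataclass
class MultiCellSyntheticControl:
    """Hypothetical interface sketch; not an existing class."""

    regions: list[str]
    treated: list[str]
    method: str = "independent"  # or "aggregated"

    def plan(self) -> list[tuple[list[str], list[str]]]:
        """Return (target regions, donor regions) for each analysis to run."""
        donors = [r for r in self.regions if r not in self.treated]
        if self.method == "independent":
            # One analysis per treated region, all sharing the same donors.
            return [([t], donors) for t in self.treated]
        if self.method == "aggregated":
            # One analysis on the aggregate of all treated regions.
            return [(list(self.treated), donors)]
        raise ValueError(f"unknown method: {self.method!r}")

regions = ["a", "b", "c", "d", "e"]
independent = MultiCellSyntheticControl(regions, treated=["a", "b"]).plan()
aggregated = MultiCellSyntheticControl(regions, treated=["a", "b"], method="aggregated").plan()
print(independent)
print(aggregated)
```

Keeping the choice of method as a single argument would let the same data feed both approaches for comparison.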

drbenvincent self-assigned this Apr 23, 2024
drbenvincent added the documentation (Improvements or additions to documentation) label Apr 23, 2024
drbenvincent linked a pull request May 7, 2024 that will close this issue