Compute quantiles with groupby #757

Open
kvanderwijst opened this issue Jul 4, 2023 · 2 comments

Comments

@kvanderwijst

It might be nice to be able to compute quantiles after a groupby command, maybe in a syntax similar to this:

(
    df
    .filter(variable="Emissions|CO2")
    .compute.groupby("category")
    .quantiles([0.05, 0.95])
    .plot(color="category", fill_between=True)
)
@danielhuppmann
Member

I like the idea! The main question is how to name the model/scenario/variable/region values of the computed data so that the returned object can again be an IamDataFrame.

@znicholls
Collaborator

This is how we solved this in scmdata (although ignore the type hints, they're broken): https://github.com/openscm/scmdata/blob/30b8ce9037af634551c9199f411fe5743b8e2e63/src/scmdata/run.py#L1628

We let the user specify the output values for e.g. model/scenario with the op_col variable and say whether they want to try casting back to an ScmRun object with the as_run variable. You don't have to do it that way, of course: you could also always try to cast back to IamDataFrame and use something like op_col to let users pass in the extra data needed to make that conversion valid.
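To make the idea concrete, here is a minimal sketch in plain pandas rather than pyam or scmdata. It assumes long-format data with the columns model, scenario, region, variable, unit, year and value plus a grouping column such as category. The function name `groupby_quantiles`, the `op_cols` argument and its `{group}`/`{quantile}` placeholders are all hypothetical and only illustrate how user-supplied labels could make the aggregated rows uniquely identifiable again; this is not how either library implements it.

```python
import pandas as pd

# Identifier columns assumed for the long-format data (an assumption of this sketch).
IDX = ["model", "scenario", "region", "variable", "unit"]


def groupby_quantiles(data, by, quantiles, op_cols=None):
    """Hypothetical helper: quantiles of `value` per group and year.

    `op_cols` maps an identifier column to a label template; the
    placeholders {group} and {quantile} are filled in per output row.
    """
    op_cols = op_cols or {}

    # Quantiles across all rows that share the same group and year.
    out = (
        data.groupby([by, "year"])["value"]
        .quantile(quantiles)
        .rename_axis([by, "year", "quantile"])
        .reset_index()
    )

    for col in IDX:
        if col in op_cols:
            # User-supplied label, optionally encoding group and quantile.
            out[col] = [
                op_cols[col].format(group=g, quantile=q)
                for g, q in zip(out[by], out["quantile"])
            ]
        elif col in out.columns:
            # The grouping column is itself an identifier; keep it.
            continue
        elif data[col].nunique() == 1:
            # Unambiguous: carry the single value over unchanged.
            out[col] = data[col].iloc[0]
        else:
            raise ValueError(
                f"'{col}' is ambiguous after aggregation; pass it via op_cols"
            )
    return out


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    example = pd.DataFrame(
        {
            "model": ["m1", "m2", "m3"],
            "scenario": ["s1", "s2", "s3"],
            "region": ["World"] * 3,
            "variable": ["Emissions|CO2"] * 3,
            "unit": ["Mt CO2/yr"] * 3,
            "category": ["A"] * 3,
            "year": [2020] * 3,
            "value": [1.0, 2.0, 3.0],
        }
    )
    print(
        groupby_quantiles(
            example,
            by="category",
            quantiles=[0.05, 0.95],
            op_cols={"model": "ensemble", "scenario": "{group} q{quantile}"},
        )
    )
```

Encoding the quantile in the scenario label is just one option; it could equally go into the variable name or stay as an extra column, which is exactly the naming question raised above.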
