Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

seaborn.objects: Add ability to specify which vars to aggregate #3426

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

qiemem
Copy link
Contributor

@qiemem qiemem commented Jul 23, 2023

This PR is more the sketch of an idea to gauge interest/get feedback (albeit a functional one). I've wanted to be able to label the value of aggregated data points. However, currently Agg only targets x or y. Specifying text to a continuous variable thus immediately breaks up aggregation groups. The idea with this PR is to allow the specification of which variables you're targeting for aggregation. For example:

so.Plot(
    {"value": range(10), "group": ["a"] * 5 + ["b"] * 5},
    x="group",
    y="value",
    text="value",
).add(
    so.Bar(), so.Agg(target_vars=["y", "text"])
).add(
    so.Text(), so.Agg(target_vars=["y", "text"])
)

which results in:

image

The biggest downside to this approach is it applies the same aggregation to all variables. Other approaches considered:

  1. Allow func to be a dict from variable names to functions, which solves the above downside.
  2. Specify which variables to aggregate by instead of which to target. This would be more consistent with Dodge, though it doesn't solve the problem of using different aggregations on different variables.

Other downsides include having to re-specify the same aggregation for all marks, but that's easily solved by the user (just assign the agg to a variable).

I'd also like to add similar functionality to other transforms in the stats package, like so.Est and so.PolyFit. The primary application would again be for labeling the values of those things, but I could also imagine it being useful for other things.

As an aside, I cannot believe how easy this was to implement! What a lovely codebase to work with.

@codecov
Copy link

codecov bot commented Jul 23, 2023

Codecov Report

Merging #3426 (e0953e0) into master (3ef2a0f) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #3426   +/-   ##
=======================================
  Coverage   98.32%   98.32%           
=======================================
  Files          77       77           
  Lines       24303    24305    +2     
=======================================
+ Hits        23897    23899    +2     
  Misses        406      406           
Impacted Files Coverage Δ
seaborn/_core/plot.py 98.26% <100.00%> (+<0.01%) ⬆️
seaborn/_stats/aggregation.py 97.56% <100.00%> (+0.06%) ⬆️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant