The KDE transform creates values where there are none when used with {"resolve": "shared"}
#3815
Comments
@jheer Do you think this is something that is suitable for implementation on the Vega side of things or does it belong in the Vega-Lite repo? A related issue stemming from this is that setting the x-scaled to |
If I add a Open the Chart in the Vega Editor In my opinion this is something for the VL-repository. |
Thanks @mattijn, setting the transform resolve to independent would fix both of the specs above, but it would lead to a jagged appearance when two densities share the same chart, as described in vega/vega-lite#9078. So we would either need another way to fix that (maybe setting the steps + the extent?) so that we can use |
After investigating this further, I can give a more comprehensive explanation of what is going on. Here is a single spec that contains both issues. I can't find any combination of parameters that lets each density end at the min/max value of its data AND lets the two grouped/colored densities display properly on top of each other.
Step 1: Coloring by one variable and faceting by another. You can see how the lower facet ("Open") is extended all the way to the x-axis min around 3.5, although there are no data points there. (Also note that by default Vega-Lite now stacks areas, which is not ideal for distribution densities since it makes them harder to compare, but this is a separate issue: vega/vega-lite#9170.)
Step 2: I can fix the issue with the extension to zero if I set the resolve to independent and remove the impute transform as you suggested. However, that automatically unstacks the areas (which is often a good default, but would be unexpected to someone who explicitly specified a stacked density):
In other words, I can't find a combination of parameters that allows me to create this chart (correctly stacked on top, and correct extent on the bottom): |
It looks like the first case can be resolved by explicitly setting the kde resolve, which we add in vega/vega-lite#9172. #3815 (comment) is a bit trickier but could be addressed with a |
I'm working on this now |
If {"resolve": "shared"} is set, the extent of a grouped density transform incorrectly uses the min/max of the entire dataset instead of the min/max of each group. This results in long lines where there are no observations at all, instead of the density stopping at the last data point in the group. I noticed this in Vega-Lite, but wonder if it could be fixed directly in the KDE transform in Vega instead of doing some post-processing, such as dropping zeros, in Vega-Lite. I understand that the computation needs to happen over the same domain to enable stacking, but would it be possible to trim the densities after that so they only include values that exist within each group? This would also be helpful for the violin plot implementation.
This chart is created in altair 5.1.2, which uses VL 5.15.1, and shows the undesired behavior:
Open the Chart in the Vega Editor
The desired behavior would look like this where each density is cut at the min/max values of each group:
Altair code
Ref vega/vega-lite#9078
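To make the request concrete, here is a minimal sketch of "compute on a shared domain, then trim per group". It is written in plain NumPy and is not how the Vega KDE transform is implemented. The function names, the fixed bandwidth, and the sample data are all hypothetical and only for illustration.

```python
import numpy as np


def gaussian_kde_on_grid(values, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `values` at each grid point."""
    values = np.asarray(values, dtype=float)
    z = (grid[:, None] - values[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (values.size * bandwidth)


def shared_then_trimmed(groups, steps=200, bandwidth=0.3):
    """Compute each group's density on one shared grid, then keep only the
    grid points that fall inside that group's own [min, max] range.

    NOTE: hypothetical helper; the bandwidth is a fixed assumption here,
    not an automatically selected one.
    """
    all_values = np.concatenate(
        [np.asarray(v, dtype=float) for v in groups.values()]
    )
    # Shared domain: every group is sampled at the same x positions,
    # which is what makes stacking possible.
    grid = np.linspace(all_values.min(), all_values.max(), steps)
    result = {}
    for name, values in groups.items():
        values = np.asarray(values, dtype=float)
        density = gaussian_kde_on_grid(values, grid, bandwidth)
        # Trim: drop samples outside the range observed in this group.
        inside = (grid >= values.min()) & (grid <= values.max())
        result[name] = (grid[inside], density[inside])
    return result


# Made-up data: two groups that do not overlap on the x-axis.
groups = {"a": [1.0, 1.5, 2.0, 2.2], "b": [4.0, 4.5, 5.0, 6.0]}
trimmed = shared_then_trimmed(groups)
```

Because the trimming happens on the shared grid, the remaining samples of every group still line up at the same x positions. Each curve ends at the nearest grid point inside its group's range, so a finer grid (more steps) gets closer to the exact min/max.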