This doesn't have a parallel in R, but I think it really affects usability in Python with pandas.
While multilevel indices aren't tidy data, multilevel column names don't break any tidy data rules, yet we can't reference them in an aesthetic. Furthermore, they get created a lot in pandas as a result of groupby aggregation, and they are not trivial to get rid of. I don't know the internals, and since there is no R behaviour to copy, someone could choose whichever syntax is easiest to implement. Perhaps the most obvious choice would be to allow tuples in the aesthetic, since that is the equivalent accessor syntax in pandas.
Example:
```python
import numpy as np
import pandas as pd
from plotnine import ggplot, aes, geom_point

df = pd.DataFrame(dict(A=["A", "B", "C", "D"] * 25, X=np.random.random(100), Y=np.random.random(100)))
print(df)
adf = df.groupby("A").agg(["mean", "median", "std"])
print(adf)
# It would be good to now be able to plot X:mean against Y:mean, something like:
ggplot(adf, aes(x=("X", "mean"), y=("Y", "mean"))) + geom_point()
# The above doesn't work; instead you have to do something like this:
ggplot(adf, aes(x=adf[("X", "mean")], y=adf[("Y", "mean")])) + geom_point()
```
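For comparison, here is a rough sketch of flattening the two column levels by hand before plotting. It rebuilds the same kind of `adf` as in the example, but seeded so the result is reproducible. `flatten_columns` and the `"_"` separator are my own hypothetical choices, not anything plotnine or pandas provides, and the collision check is there because joined names are not guaranteed to be unique.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the example, seeded for reproducibility.
rng = np.random.default_rng(0)
df = pd.DataFrame({"A": ["A", "B", "C", "D"] * 25,
                   "X": rng.random(100),
                   "Y": rng.random(100)})
adf = df.groupby("A").agg(["mean", "median", "std"])


def flatten_columns(frame, sep="_"):
    """Hypothetical helper: join each column tuple into one string name."""
    if not isinstance(frame.columns, pd.MultiIndex):
        return frame  # nothing to flatten
    flat = frame.copy()
    flat.columns = [sep.join(map(str, col)) for col in frame.columns]
    # Joined names can collide, e.g. ("a_b", "c") and ("a", "b_c").
    if flat.columns.duplicated().any():
        raise ValueError("flattened column names collide")
    return flat


# reset_index() turns the group key "A" back into an ordinary column.
flat = flatten_columns(adf).reset_index()
print(flat.columns.tolist())
```

With the columns flattened like this, the plain string form `aes(x="X_mean", y="Y_mean")` should be usable, at the cost of an extra step after every aggregation.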
Dealing with multilevel dataframes in plotnine would complicate the internals; there would be more to it than just treating a tuple as a multilevel column selector. If we try to collapse the levels for the user, I think there will be edge cases that lead to wrong output. I have said something about multilevel indices elsewhere.
I use plydata for data manipulation, so I do not run into multilevel dataframes.
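Staying within pandas, one way to avoid creating multilevel columns in the first place is named aggregation in `groupby(...).agg(...)`. This is only a sketch: the output names `X_mean` and `Y_mean` are arbitrary choices for illustration, and the data is a seeded stand-in for the frame in the original example.

```python
import numpy as np
import pandas as pd

# Seeded stand-in for the data in the original example.
rng = np.random.default_rng(0)
df = pd.DataFrame({"A": ["A", "B", "C", "D"] * 25,
                   "X": rng.random(100),
                   "Y": rng.random(100)})

# Named aggregation: each keyword is the output column name, and each
# value is a (source column, aggregation function) pair.
summary = (
    df.groupby("A")
      .agg(X_mean=("X", "mean"), Y_mean=("Y", "mean"))
      .reset_index()
)
print(summary.columns.tolist())
```

The result has a single level of string column names, so it can be referenced in `aes` the usual way. The trade-off is that every statistic has to be named explicitly instead of passing one list of functions.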