How to plot facet grid with hue argument? #135

Open
JohannesWiesner opened this issue Oct 4, 2023 · 5 comments

Comments

@JohannesWiesner

JohannesWiesner commented Oct 4, 2023

Hi! I am not sure whether I am simply doing something wrong, which is why I am opening a separate issue here. It is related to #120, however.

I would like to use a FacetGrid plot to give my figure more structure. Specifically, I have a data frame in which I would like to test for receptor expression differences between brain regions of interest and non-regions of interest (using the hue argument). I would like to do this for n different receptors (r1, r2, ... rn). On top of that, receptors can be assigned to broader receptor groups, which should be shown as separate subplots within my facet grid (one subplot per receptor group, with a maximum of 2 columns). See this beautiful hand-made preview:

[Image: hand-made preview of the desired layout]

And here's my dataframe:

expression.csv

Is it possible to achieve this with statannotations? I am not sure, because the example does not use the hue argument and I don't know whether that causes problems. I tried:

import matplotlib.pyplot as plt
import seaborn as sns
from statannotations.Annotator import Annotator

# One pair per receptor: roi vs. non-roi (must exist before the Annotator is created)
pairs = [((receptor,'roi'),(receptor,'non-roi')) for receptor in expression_long['receptor'].unique()]

annot = Annotator(None, pairs)

g = sns.FacetGrid(expression_long, col='receptor_group', height=12, sharey=False)

plot_params = {'x':'expression',
               'y':'receptor',
               'hue':'roi',
               'hue_order':['roi','non-roi'],
               'orient':'h'}

g.map_dataframe(annot.plot_and_annotate_facets,
                plot='boxplot',
                plot_params=plot_params,
                configuration={"test": "Mann-Whitney"},
                annotation_func="apply_test")
plt.show()

but this gives me:

ValueError: Missing group value CHRM1 in receptor (specified in pairs)

@trevismd
Owner

trevismd commented Oct 5, 2023

Yes, statannotations works well with the hue argument in FacetGrid too, but pairs are defined at the plot level, so the x and hue values should be the same across plots, which is not the case for you here.
The "today" solution for you would perhaps be to define subplots to create your desired layout and then use the "regular" plot + statannotations on each subplot, as you'll have different pairs to compare in each one.
(See this post: https://www.statology.org/seaborn-subplots/)
Something like this:

import matplotlib.pyplot as plt
import seaborn as sns
from statannotations.Annotator import Annotator

# expression_long: the long-format data frame from the question, with the
# columns 'receptor', 'receptor_group', 'roi' and 'expression'
annot = Annotator.get_empty_annotator()
plot_params = {
    'x':'receptor',
    'y':'expression',
    'hue':'roi',
    'hue_order':['roi','non-roi'],
}
receptor_groups = expression_long['receptor_group'].unique()
sns.set_palette("Paired")  # color_palette() alone only returns the palette
with sns.plotting_context("paper"):
    fig, axes = plt.subplots(4, 2, figsize=(20, 30))
    for ax_row_idx, ax_row in enumerate(axes):
        for ax_col_idx, ax in enumerate(ax_row):
            ax_idx = ax_row_idx * 2 + ax_col_idx
            # Hide the grid cells that have no receptor group to show
            if ax_idx >= len(receptor_groups):
                ax.set_axis_off()
                continue
            ax_group = receptor_groups[ax_idx]
            expression_long_group = expression_long.loc[expression_long.receptor_group==ax_group, :]
            group_receptors = expression_long_group['receptor'].unique()

            sns.boxplot(ax=ax, data=expression_long_group, **plot_params)
            # Pairs are rebuilt per subplot from the receptors of this group only
            annot.new_plot(
                ax,
                data=expression_long_group,
                pairs=[((receptor,'roi'),(receptor,'non-roi')) for receptor in group_receptors],
                plot='boxplot',
                **plot_params
            ).configure(test="Mann-Whitney").apply_and_annotate()

            ax.set_title(ax_group)
            # Rotate the tick labels of crowded subplots
            if len(group_receptors) > 10:
                ax.tick_params(axis='x', labelrotation=45)
plt.show()

Which results in this approximation of your diagram :) Tweaking spacing and legends, groups ordering, maybe using the last row for your larger group (look for add_subplot) should enable you to get there though.
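
For the "last row for your larger group" idea, a GridSpec layout is one option. The following is only a minimal matplotlib sketch: the helper name make_axes and the group names are placeholders for illustration, not taken from the data, and the figure size is an arbitrary choice.

```python
import matplotlib.pyplot as plt


def make_axes(groups, wide_group, ncols=2):
    """Return (fig, {group: Axes}); `wide_group` gets a full-width bottom row."""
    small = [g for g in groups if g != wide_group]
    nrows_small = -(-len(small) // ncols)  # ceiling division
    fig = plt.figure(figsize=(20, 6 * (nrows_small + 1)))
    grid = fig.add_gridspec(nrows_small + 1, ncols)
    axes = {}
    # Regular groups fill the grid row by row
    for idx, group in enumerate(small):
        axes[group] = fig.add_subplot(grid[idx // ncols, idx % ncols])
    # The last row spans every column
    axes[wide_group] = fig.add_subplot(grid[nrows_small, :])
    return fig, axes


# Placeholder group names, for illustration only
fig, axes = make_axes(["A", "B", "C", "D"], wide_group="D")
```

Each entry of axes can then be passed as ax to sns.boxplot and annot.new_plot in the loop above, instead of iterating over the array returned by plt.subplots.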

[Image: resulting plot]

@JohannesWiesner
Author

Perfect, thanks so much for the code! The only issue that I see right now is that the multiple comparisons correction is now done within each group and not over all receptors, right?

@trevismd
Owner

trevismd commented Oct 6, 2023

Of course!

This is correct, but it is also the case with plot_and_annotate_facets (I should make that clearer).

Depending on the correction method, you can fix this by either

  • passing a num_comparisons option (like for Bonferroni) or
  • running the stats beforehand and then using set_pvalues instead on each subplot.
    In that case, you'll have to
    1. Compute all the pairs you use in the plots
    2. Plot a chart with receptors of all groups, but using the pairs described above
    3. Collect the pvalues for each pair
    4. Use these when you're making the "real" plot as drafted above.
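
The second route can be sketched without any plotting. This is a rough outline under assumptions, not the library's own procedure: it swaps step 2 for a hand-written Benjamini-Hochberg adjustment, it assumes the raw p-values were already computed per receptor (for example with a Mann-Whitney test), and the helper names, the receptors R2 and R3, and all p-values are made up for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values for a {key: raw_p} mapping."""
    m = len(pvalues)
    ranked = sorted(pvalues.items(), key=lambda item: item[1], reverse=True)
    adjusted, running_min = {}, 1.0
    for offset, (key, p) in enumerate(ranked):
        rank = m - offset  # rank m is the largest p-value, rank 1 the smallest
        running_min = min(running_min, p * m / rank)
        adjusted[key] = running_min
    return adjusted


def pairs_and_pvalues(adjusted, group_receptors, levels=("roi", "non-roi")):
    """Build `pairs` and `pvalues` lists, in matching order, for one subplot."""
    pairs = [((r, levels[0]), (r, levels[1])) for r in group_receptors]
    pvalues = [adjusted[r] for r in group_receptors]
    return pairs, pvalues


# Made-up raw p-values for ALL receptors across all groups
raw = {"CHRM1": 0.01, "R2": 0.04, "R3": 0.03}
adjusted = benjamini_hochberg(raw)  # one correction over every comparison

# Values for a subplot that only shows CHRM1
pairs, pvalues = pairs_and_pvalues(adjusted, ["CHRM1"])
```

On each subplot, pairs would then go to annot.new_plot and pvalues to set_pvalues (followed by annotate), with the test and the built-in correction switched off in configure so that the values are not corrected a second time.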

@JohannesWiesner
Author

This is correct, but it is also the case with plot_and_annotate_facets (I should make that clearer).

Ah, interesting! Yes, I think making that clearer would help a lot :)

passing a num_comparisons option (like for Bonferroni)

That sounds like a good idea, but it would only work for methods that do not need to know all the p-values beforehand, right?
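
The distinction can be checked with a small pure-Python sketch. It uses made-up p-values and hand-written versions of the Bonferroni and Holm adjustments, so it illustrates the statistics only and makes no claim about how statannotations handles num_comparisons internally.

```python
def bonferroni(pvalues, num_comparisons=None):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(pvalues) if num_comparisons is None else num_comparisons
    return [min(1.0, p * m) for p in pvalues]


def holm(pvalues):
    """Holm step-down adjustment; it needs the rank of every p-value."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted, running_max = [0.0] * m, 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, pvalues[i] * (m - rank)))
        adjusted[i] = running_max
    return adjusted


# Made-up p-values for two subplots, for illustration only
subplot_a, subplot_b = [0.01, 0.02], [0.03, 0.04]
everything = subplot_a + subplot_b

# Bonferroni per subplot, but with the global number of comparisons
split_bonf = bonferroni(subplot_a, 4) + bonferroni(subplot_b, 4)
bonf_matches = split_bonf == bonferroni(everything)

# Holm per subplot, each subplot ranked on its own
split_holm = holm(subplot_a) + holm(subplot_b)
holm_matches = split_holm == holm(everything)

print(bonf_matches, holm_matches)
```

Bonferroni adjusts each p-value independently, so knowing the total count is enough. Holm (like other rank-based procedures) depends on where a p-value sits among all the others, which is why the per-subplot result differs from the global one.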

@JohannesWiesner
Author

Of course!

This is correct, but it is also the case with plot_and_annotate_facets (I should make that clearer).

Depending on the correction method, you can fix this by either

  • passing a num_comparisons option (like for Bonferroni) or

  • running the stats beforehand and then using set_pvalues instead on each subplot.
    In that case, you'll have to

    1. Compute all the pairs you use in the plots
    2. Plot a chart with receptors of all groups, but using the pairs described above
    3. Collect the pvalues for each pair
    4. Use these when you're making the "real" plot as drafted above.

I would love it if this worked out of the box! The general idea is that you often want to plot with a facet grid for better readability, but you don't want the multiple-comparison correction to be done within each subplot.
