Support for "Other Intersection" #88

Open
nhartwic opened this issue Jan 10, 2020 · 6 comments

Comments

@nhartwic

I've started playing around with this module. All in all, it works quite well. I've been using it to represent the major intersections of a large number (20) of categories. It's quite easy to plot the top 25 intersections for my set of categories, and this style makes it easy to identify the significant intersections. Unfortunately, this means that some fraction of my data is always left off the plot. Ideally, I'd capture this information with a kind of null category representing every other intersection not shown in my figure.

[Image: upset_small_fig]

For example, I generated the above image for a subset of my categories that are particularly interesting to me. This image works great, but to make it I had to throw out 2^8 - 25 valid combinations of my categories. The total number of orthogroups associated with those intersections is 1613. Ideally, I'd be able to represent these using a special "Other Intersection" at the end of the figure.

Ideally, I think I'd like to see the plot function updated to include a "max_intersections" parameter. Only the largest, or perhaps the first, "max_intersections" number of intersections should be drawn in the figure and then an additional "other intersection" column can be drawn to represent the rest of the data that was passed in.
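
To make the proposal concrete, here is a minimal sketch of the computation in plain Python. The helper name `clip_to_top`, its arguments, and the sample counts are made up for illustration; this is not upsetplot's API, only a guess at what a "max_intersections" option might compute before drawing.

```python
def clip_to_top(counts, max_intersections, other_label="Other Intersection"):
    """Keep the largest intersections and fold the rest into one entry.

    counts maps a frozenset of category names to a count.
    Hypothetical helper for illustration only; not part of upsetplot.
    """
    # Sort by count, largest first. sorted() is stable, so ties keep input order.
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    kept = dict(ranked[:max_intersections])
    # Only add the remainder entry when something was actually dropped.
    if len(ranked) > max_intersections:
        kept[other_label] = sum(v for _, v in ranked[max_intersections:])
    return kept


# Made-up example data: five intersections over categories A, B and C.
example = {
    frozenset({"A"}): 40,
    frozenset({"B"}): 25,
    frozenset({"A", "B"}): 10,
    frozenset({"A", "C"}): 3,
    frozenset({"B", "C"}): 2,
}
clipped = clip_to_top(example, 2)
```

The point of the remainder entry is that the totals still add up: every count that is not drawn as its own column is still accounted for in the final one.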

@jnothman
Owner

Do you think a remainder column is particularly useful?

@jnothman
Owner

What would it look like visually?

@jnothman
Owner

I'm worried about the remainder being very large in some cases.

@nhartwic
Author

nhartwic commented Jan 10, 2020

It is useful in that it represents how much of your dataset isn't in any of the columns you are currently looking at.

In terms of what it would look like, it would be another column at the end of the graph. The hardest thing to figure out visually is how to label it as being "every intersection not otherwise covered in this figure". This is just a quick mockup. I'm not in love with it...

[Image: mockup_upset_small_fig name]

The closest I can get to what I'm after right now is something like...

[Image: test]

...Doing this takes a good bit of munging though. The following function works ok...

import pandas

default_title = "Counts of Orthogroups for Sets of Samples"

def clip_intersection_counts(s, max_cols, name="Other Intersection"):
    """ Remove small intersections from s and add a single virtual
        intersection to represent them.

        Input -
        s :
            A pandas Series with a named multilevel index that maps from
            category memberships to the count for that intersection
        max_cols :
            the number of intersections to keep
        name :
            the name to give to the added virtual intersection

        >>> import upsetplot
        >>> counts = upsetplot.generate_counts()
        >>> clipped_counts = clip_intersection_counts(counts, 4)
        >>> upsetplot.plot(clipped_counts)
    """
    if max_cols < len(s):
        # keep the max_cols largest intersections
        bigs = s.nlargest(max_cols)
        # everything that was not kept goes into the virtual intersection
        other_total = s.sum() - bigs.sum()
        new_names = list(s.index.names) + [name]
        # kept intersections are not members of the virtual category
        clipped = {t + (False,): v for t, v in bigs.items()}
        # the virtual intersection is a member of the virtual category only
        clipped[(False,) * len(s.index.names) + (True,)] = other_total
        s = pandas.Series(clipped)
        s.index.names = new_names
    return s

@ebnguyen

Hello both,

I am quite new to coding and I have successfully made my first graph with upsetplot! Thanks a lot @jnothman!

I am just trying to finalize some aspect details. How do we increase the space for the element titles? I had to truncate mine but I would like to use the full words. Also, @nhartwic, how did you select the top 25?

Thanks for your further help :)

Eve

[Image: Capture]

@jnothman
Owner

> I am just trying to finalize some aspect details. How do we increase the space for the element titles?

This issue is not the right place to put your comment... I'm not happy with the rendering you've got there, @ebnguyen... The code is meant to estimate how much space is needed for text, but you've got all sorts of things not showing right in your diagram: a big gap between the axes and the bars, and totals overflowing at the left of the figure. Could you, by any chance, please post your data and configuration in a new issue so that I can debug?
