Real or technical artifacts due to errors detected in UMAP following run.flowsom()? #161

Closed
denvercal1234GitHub opened this issue Apr 27, 2023 · 3 comments

Comments

@denvercal1234GitHub

denvercal1234GitHub commented Apr 27, 2023

Hi there,

Thanks for the package.

When I ran run.flowsom, even with meta.k=20 (automatic meta.k resolved 11 clusters), the cluster UMAP doesn't look "right": most clusters are clumped together on one side of the UMAP, and there are two tiny clusters that sit far away from all the others.

Do you think these 2 clusters (6 and 20) are simply artifacts? If so, would you mind suggesting some ways to assess that?

Thank you for your help!

Screenshot 2023-04-27 at 12 07 09

Related to #154.

@tomashhurst
Member

Hi @denvercal1234GitHub, hmm, I get this on occasion. The issue is to do with the UMAP calculation, which should be independent of the FlowSOM clustering (i.e., even if you change the FlowSOM clustering, you will still get the same UMAP arrangement).

You might find that the cells on the far right are stacked at the maximum value on multiple channels, perhaps because an antibody aggregate is bound to them or something similar. It would be worth doing some plotting to find out. If it is something technical like this, you could filter out those cells prior to analysis.
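A minimal sketch of that check, written in plain Python so it does not depend on any particular package. The function name `flag_saturated_cells`, the `tol` and `min_channels` parameters, and the toy values are all hypothetical. It also treats the largest observed value in each channel as a stand-in for the instrument maximum, which is an assumption that may not hold for your data.

```python
def flag_saturated_cells(cells, tol=0.01, min_channels=2):
    """Flag cells that sit at the top of the range on several channels.

    cells: list of rows, one row of channel values per cell.
    tol: fraction of each channel's observed range counted as "at the maximum".
    min_channels: how many channels a cell must hit to be flagged.
    """
    n_channels = len(cells[0])
    columns = [[row[j] for row in cells] for j in range(n_channels)]

    thresholds = []
    for col in columns:
        lo, hi = min(col), max(col)
        # A constant channel carries no information, so it never counts.
        thresholds.append(None if hi == lo else hi - tol * (hi - lo))

    flags = []
    for row in cells:
        hits = sum(
            1
            for value, thr in zip(row, thresholds)
            if thr is not None and value >= thr
        )
        flags.append(hits >= min_channels)
    return flags


# Toy data (made up): four cells, three channels.
cells = [
    [0.1, 0.2, 0.3],
    [0.5, 0.4, 0.2],
    [4.0, 4.0, 0.1],  # piled up at the top of channels 0 and 1
    [0.3, 0.1, 0.9],  # highest on one channel only
]
flags = flag_saturated_cells(cells)
```

Every channel has at least one cell at its own maximum, so a single hit means little. Requiring hits on two or more channels is what singles out the suspicious cells here.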

Tom

@denvercal1234GitHub
Author

Thanks @tomashhurst! This is very useful. Would you mind elaborating a bit on which plots you would make to investigate whether these cells are a technical artifact? These 2 populations are actually what we expect to detect in our data based on RNAseq of the same samples.

@tomashhurst
Member

@denvercal1234GitHub sorry for the delayed reply here. Essentially an n x n set of marker-vs-marker plots: CD3 vs CD4, CD3 vs CD19, CD3 vs N, etc., each coloured by metacluster. You could also create a heatmap with make.pheatmap to see which markers each metacluster expresses most highly.
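As a rough illustration of what such a heatmap summarises, the sketch below computes the median expression of each marker per metacluster and lists any metacluster that is highest on every marker. It is plain Python rather than the package's own functions. The helper name `median_by_metacluster`, the marker names, the labels and the values are all made up for the example.

```python
from collections import defaultdict
from statistics import median


def median_by_metacluster(cells, labels, markers):
    """Return {metacluster: {marker: median expression}}."""
    grouped = defaultdict(list)
    for row, label in zip(cells, labels):
        grouped[label].append(row)

    summary = {}
    for label, rows in grouped.items():
        summary[label] = {
            marker: median(row[j] for row in rows)
            for j, marker in enumerate(markers)
        }
    return summary


# Toy data (made up): five cells, three markers, three metaclusters.
markers = ["CD3", "CD4", "CD19"]
cells = [
    [2.0, 1.8, 0.1],
    [2.2, 2.0, 0.2],
    [0.1, 0.2, 2.5],
    [5.0, 5.0, 5.0],
    [5.0, 4.8, 5.0],
]
labels = [1, 1, 2, 6, 6]

summary = median_by_metacluster(cells, labels, markers)

# Metaclusters whose median is the highest on every marker at once.
# Being bright on everything is one possible sign of an aggregate.
high_on_all = [
    label
    for label in summary
    if all(
        summary[label][m] >= max(summary[other][m] for other in summary)
        for m in markers
    )
]
```

A metaclusters-by-markers table like `summary` is the kind of input a heatmap is drawn from. A metacluster that is high on every marker deserves a closer look in the pairwise plots, though this alone does not prove it is an artifact.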

There will likely be a group of cells with ultra-high expression on a set of markers, possibly because they are some kind of aggregate. You could also plot them on tSNE or FItSNE, which should squish them into the 2D plot a bit better and might let you see how they compare to the remaining cells in the dataset.

I'll close this for now, but let us know how you get on.
