Cluster labels when using a hard-coded prior #26

Open
carter-allen opened this issue Apr 15, 2023 · 3 comments

@carter-allen

In the vignette section "Using Priors", the pfit2 object is supposed to have $K = 2$ mixture components, but table(pfit2@label) shows all observations assigned to one cluster. The scatterplot of pfit2, however, shows 2 components. Is there a reason for the discrepancy?

@gfinak
Member

gfinak commented Apr 15, 2023

You ran the example locally and got a different result? The question is: are there reasons for this?
Yes, there are.
flowClust is not optimally maintained. I don't have the time to devote to it that I had in the past. The package has seen three different authors and maintainers in its life so far, and the prior code found little use in practice. It is, in the end, research code. Lots of it should probably be rewritten in a more modern style.
The use cases where I would trust the package are identifying populations in FSC/SSC space plus a few other markers. That's what has been most used and best maintained.
Some day I'll get to rewriting it.

@carter-allen
Author

Hi, thanks for the response! It is actually not a discrepancy between the vignette and the results I get locally. I am able to reproduce the vignette results exactly. However, when I check table(pfit2@label) after the final line of the vignette, I find that all observations are assigned to a single mixture component, even though plot(pfit2, data = rituximab2) displays two mixture components.

I've found the package to work quite well for the use cases you mentioned, but I'd like to try to incorporate prior information. Would you recommend against using any non-default prior at this time?

Thanks in advance!

@gfinak
Member

gfinak commented Apr 15, 2023

I see. That sounds like a bug. It might be simple to resolve but it might not. I don't have a bioc dev environment available to me and I wouldn't be able to get to investigating it for some time.
The flowClust fit object also has a slot that holds the probability of each cell belonging to each component. The row-wise argmax of that can give you cell-level assignments, but it wouldn't account for outliers, as I believe the label slot is supposed to.
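
A minimal sketch of that row-wise argmax workaround, in base R only. The matrix `z` below is a made-up stand-in for the posterior-probability slot, and the helper name `argmax_label` is hypothetical. With a real fit the matrix would come from the fit object; the slot name `z` (as in `pfit2@z`) is an assumption here and should be checked with `slotNames(pfit2)`.

```r
# Hypothetical stand-in for the posterior-probability matrix of a fit:
# rows are cells, columns are mixture components.
# With a real fit this might be pfit2@z (slot name is an assumption).
z <- rbind(
  c(0.90, 0.10),
  c(0.20, 0.80),
  c(0.55, 0.45),
  c(NA,   NA)     # e.g. a cell that was excluded before fitting
)

# Row-wise argmax: index of the most probable component for each cell.
# Rows containing NA get an NA label instead of an error.
argmax_label <- function(p) {
  if (anyNA(p)) NA_integer_ else which.max(p)
}
labels <- apply(z, 1, argmax_label)

# Tabulate the assignments, keeping NA visible.
table(labels, useNA = "ifany")
```

Comparing this table with `table(pfit2@label, useNA = "ifany")` would show whether the two disagree. As noted above, the argmax assigns every cell with a probability row to some component, so it does not flag outliers.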
