Dependency on Background Cells (followup) #4

Open
NoahMottelson opened this issue Nov 15, 2021 · 3 comments
Assignees
Labels
enhancement New feature or request

Comments

@NoahMottelson

Hey again,

In the 7th discussion point of the manuscript you mention the possibility of choosing the control gene sets based on a different data set than the one you're analyzing. Is this implemented as an option in the software yet?

@martinjzhang
Owner

Hi,

There is no implementation yet, but we are working on it. There will likely be one in a month or two.

Best,
Martin

@NoahMottelson
Author

Thanks for the super fast reply.

@martinjzhang
Owner

Hi Noah,

I am following up on this issue.

We currently think this is a complicated issue. The main problem is that the target scRNA-seq data and the reference scRNA-seq data may differ in modality (e.g., they may have been collected using different technologies) and so may not be comparable. In that case, it is hard to make statements about the false positive control of the method.

We did add a new option to adjust for cell group proportions. Specifically, it takes a set of cell group annotations and inversely weights cells by their cell group proportions (done implicitly within the scDRS computations). This option may partially address the issue of imbalanced data sets. See the "adj_prop" option in https://martinjzhang.github.io/scDRS/reference_cli.html
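
As a rough illustration of what inverse weighting by cell group proportion means, here is a minimal sketch. It is not the scDRS internal code (scDRS applies the adjustment implicitly), and the function name `inverse_proportion_weights` and the normalization are assumptions made for this example only.

```python
from collections import Counter


def inverse_proportion_weights(groups):
    """Return one weight per cell, inversely proportional to its group's share.

    `groups` is a list of cell group annotations, one entry per cell.
    Normalization (an assumption of this sketch): weights sum to the number
    of cells, so every group ends up with the same total weight.
    """
    counts = Counter(groups)   # number of cells in each group
    n_cells = len(groups)
    n_groups = len(counts)
    # The proportion of group g is counts[g] / n_cells, so
    # 1 / (n_groups * proportion) equals n_cells / (n_groups * counts[g]).
    return [n_cells / (n_groups * counts[g]) for g in groups]


# Three cells in one group, one cell in another: the rare group is up-weighted.
weights = inverse_proportion_weights(["T", "T", "T", "B"])
```

With these weights, an abundant cell group no longer dominates statistics computed across cells simply because it contributes more cells.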

It may take a while for us to add the option for using reference data sets. If this is urgent for your research project, a hacky workaround is the following: after calling scdrs.preprocess, replace the mean_var column in adata.uns["SCDRS_PARAM"]["GENE_STATS"] with your own 20*20 mean-variance gene bins (categorical) computed from the reference data. You should then be able to call scdrs.score_cell, which will select control genes based on your definition of the mean-variance gene bins.
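
A minimal sketch of how such reference-based bins could be built is shown below. It uses plain Python only. The helper names, the equal-frequency binning, and the "<mean bin>.<variance bin>" label format are all assumptions of this example, not something stated in this thread. Compare the labels with the mean_var values that scdrs.preprocess produces on your own data before substituting them.

```python
def equal_frequency_split(items, n_bins):
    """Split an already-sorted list into n_bins consecutive, near-equal chunks."""
    n = len(items)
    return [items[(k * n) // n_bins:((k + 1) * n) // n_bins] for k in range(n_bins)]


def mean_var_bins(ref_mean, ref_var, n_mean_bin=20, n_var_bin=20):
    """Label each gene "<mean bin>.<variance bin>" from reference statistics.

    `ref_mean` and `ref_var` map gene name -> per-gene mean / variance
    computed on the reference data. Genes are first binned by mean, then
    binned by variance within each mean bin (hypothetical scheme).
    """
    by_mean = sorted(ref_mean, key=lambda g: ref_mean[g])
    labels = {}
    for i, mean_group in enumerate(equal_frequency_split(by_mean, n_mean_bin)):
        by_var = sorted(mean_group, key=lambda g: ref_var[g])
        for j, var_group in enumerate(equal_frequency_split(by_var, n_var_bin)):
            for gene in var_group:
                labels[gene] = f"{i}.{j}"
    return labels


# Toy reference statistics for eight genes, binned on a 2*2 grid.
toy_mean = {f"g{i}": float(i) for i in range(8)}
toy_var = dict(zip(toy_mean, [5.0, 1.0, 7.0, 3.0, 2.0, 8.0, 4.0, 6.0]))
toy_labels = mean_var_bins(toy_mean, toy_var, n_mean_bin=2, n_var_bin=2)

# Untested outline of the substitution described above, assuming GENE_STATS
# is a pandas DataFrame indexed by gene and that every target gene also
# appears in the reference:
#   gene_stats = adata.uns["SCDRS_PARAM"]["GENE_STATS"]
#   gene_stats["mean_var"] = gene_stats.index.map(labels).astype("category")
```

Genes that are present in the target data but missing from the reference would get no label under this scheme, so they need to be handled explicitly (for example, by restricting both data sets to shared genes first).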

Let me know if you have further questions.

Best,
Martin

@martinjzhang martinjzhang added the enhancement New feature or request label Aug 3, 2022
@martinjzhang martinjzhang self-assigned this Sep 30, 2022