TF analysis between conditions scRNA #118

Closed
kiwipeel opened this issue Mar 16, 2024 · 9 comments

@kiwipeel

How can I determine whether the difference in TF activity between two conditions in my single-cell dataset is significant for a given gene?

@PauBadiaM PauBadiaM self-assigned this Mar 18, 2024
@PauBadiaM PauBadiaM added the question Further information is requested label Mar 18, 2024
@PauBadiaM
Collaborator

Hi @kiwipeel,

You can perform differential expression analysis at the pseudobulk level between conditions and then use the resulting contrast-level gene statistics as input for decoupler. There is an example of this workflow in this vignette. It is in Python, but it should be relatively easy to reproduce in R if that is a limitation. Hope this is helpful!
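
For readers who want to see the shape of that workflow, here is a minimal, dependency-free sketch. It is not decoupler's own code and not the vignette's pipeline: the data are simulated, the gene names, sample names and TF weights are hypothetical, and a plain Welch t-statistic on log-transformed pseudobulk counts stands in for a proper differential expression tool. The `ulm_score` helper illustrates the univariate linear model idea (the t-value of the slope when regressing gene statistics on TF-target weights).

```python
import math
import random
from statistics import mean, variance

random.seed(0)
genes = ["G1", "G2", "G3", "G4", "G5", "G6"]
# Hypothetical TF-target weights: +1 activates, -1 represses, absent means 0.
tf_weights = {"G1": 1.0, "G2": 1.0, "G3": -1.0}
samples = {"c1": "ctrl", "c2": "ctrl", "c3": "ctrl",
           "s1": "stim", "s2": "stim", "s3": "stim"}

def simulate_cell(condition):
    """Toy counts: in 'stim' the activated targets double and the repressed one halves."""
    counts = {}
    for gene in genes:
        expected = 20.0
        if condition == "stim":
            expected *= 2.0 ** tf_weights.get(gene, 0.0)
        counts[gene] = max(0, round(random.gauss(expected, 3.0)))
    return counts

# 1) Pseudobulk: sum the raw counts of all cells from the same sample.
pseudobulk = {}
for sample, condition in samples.items():
    totals = dict.fromkeys(genes, 0)
    for _ in range(50):  # 50 simulated cells per sample
        for gene, count in simulate_cell(condition).items():
            totals[gene] += count
    pseudobulk[sample] = totals

# 2) Contrast: one statistic per gene, computed across samples (not cells).
def welch_t(a, b):
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))

def log_values(gene, condition):
    return [math.log2(pseudobulk[s][gene] + 1) for s, c in samples.items() if c == condition]

contrast = {g: welch_t(log_values(g, "stim"), log_values(g, "ctrl")) for g in genes}

# 3) ULM-style score: t-value of the slope of (gene statistic ~ TF weight).
def ulm_score(stats, weights):
    x = [weights.get(g, 0.0) for g in stats]
    y = [stats[g] for g in stats]
    n, mx, my = len(x), mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope / math.sqrt(rss / (n - 2) / sxx)

score = ulm_score(contrast, tf_weights)
print("per-gene contrast statistics:", {g: round(t, 1) for g, t in contrast.items()})
print("TF activity change (t-value):", round(score, 2))
```

The key point is that the TF score is computed once, on a single vector of contrast statistics, rather than once per cell. In a real analysis the contrast step would come from a dedicated differential expression method and the scoring step from decoupler itself.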

@kiwipeel
Author

Why can't I just apply statistical tests to the score values generated from the ULM model? Thank you in advance.

@PauBadiaM
Collaborator

Hi @kiwipeel,

You could also do that, but if the objective is to compare conditions I would recommend going the pseudobulk route, since it avoids inflating significance (artificially small p-values) by treating single cells as true replicates, which they are not.
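
To make the replicate problem concrete, here is a small simulation sketch under stated assumptions: the data are simulated with no true condition effect, each sample has its own random offset shared by all of its cells, and the sample and cell counts are arbitrary choices for illustration. It compares how often a two-sample t-statistic crosses the 5% cut-off when cells are treated as replicates versus when sample means are used.

```python
import math
import random
from statistics import mean, variance

random.seed(1)

def t_statistic(a, b):
    """Two-sample t-statistic; with equal group sizes Welch and Student agree."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))

def simulate_condition(n_samples=3, n_cells=100):
    """Cells share a random per-sample offset; there is no condition effect."""
    samples = []
    for _ in range(n_samples):
        offset = random.gauss(0.0, 1.0)  # sample-level (e.g. donor) variation
        samples.append([offset + random.gauss(0.0, 1.0) for _ in range(n_cells)])
    return samples

n_sim = 200
cell_hits = 0
pb_hits = 0
for _ in range(n_sim):
    a, b = simulate_condition(), simulate_condition()
    # Cell level: 300 vs 300 "replicates"; |t| > 1.96 is roughly p < 0.05.
    t_cells = t_statistic([x for s in a for x in s], [x for s in b for x in s])
    # Sample level: 3 vs 3 sample means; 2.776 is the two-sided 5% critical
    # value of Student's t with 4 degrees of freedom.
    t_samples = t_statistic([mean(s) for s in a], [mean(s) for s in b])
    cell_hits += abs(t_cells) > 1.96
    pb_hits += abs(t_samples) > 2.776

cell_fpr = cell_hits / n_sim
pb_fpr = pb_hits / n_sim
print("false positive rate, cells as replicates:", cell_fpr)
print("false positive rate, samples as replicates:", pb_fpr)
```

Because cells from the same sample are correlated, the cell-level test underestimates the variance of the group means and flags a difference far more often than the nominal 5%, while the sample-level test stays close to it.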

@kiwipeel
Author

Thank you. The p-values in the run_ulm results represent the significance of the score for each cell and transcription factor, am I right? Why do we create a new assay from all of these scores when some of them do not have significant p-values?

@PauBadiaM
Collaborator

Hi @kiwipeel,

Indeed! We keep all of them since p-value thresholding is completely arbitrary; depending on the application, you might want to use a stricter or a more relaxed threshold.

@kiwipeel
Author

Thank you. However, if I put a threshold on the p-value, there will be missing values in the new assay we generate from the TF scores. What is the correct way to handle this?

@PauBadiaM
Collaborator

It really depends on the downstream task you want to use them for. In your case, since you are interested in contrasting conditions, I would again recommend doing it at the pseudobulk level. There, filtering by p-value is easier to handle, since you obtain a single vector of activity changes, each of which may or may not be significant.
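
As a rough illustration of why this is easier, the sketch below filters a single contrast-level result with one score and one p-value per TF. The TF names and all numbers are made up for the example, and `significant_tfs` is a hypothetical helper, not a decoupler function.

```python
# Hypothetical contrast-level results: TF -> (activity change score, p-value).
contrast = {
    "TF_A": (4.2, 0.0004),
    "TF_B": (-3.1, 0.006),
    "TF_C": (1.0, 0.32),
    "TF_D": (-0.4, 0.70),
    "TF_E": (2.3, 0.03),
}

def significant_tfs(results, alpha):
    """Return the TFs with p < alpha, strongest absolute change first."""
    kept = {tf: score for tf, (score, p) in results.items() if p < alpha}
    return sorted(kept, key=lambda tf: abs(kept[tf]), reverse=True)

strict = significant_tfs(contrast, alpha=0.01)
relaxed = significant_tfs(contrast, alpha=0.05)
print(strict)   # ['TF_A', 'TF_B']
print(relaxed)  # ['TF_A', 'TF_B', 'TF_E']
```

With one row per TF, a threshold simply shortens the list and the full table stays available for a different cut-off later. In a cell-by-TF assay the same threshold would leave a different set of TFs missing in every cell.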

@kiwipeel
Author

Thank you again. One last question: after getting the TF assay from the model, should I use the ScaleData() function on the TF assay with the split.by argument set to my conditions?

@PauBadiaM
Collaborator

Hi @kiwipeel, if it is just for plotting, yes. I am not sure about the split.by argument though; you would want to see the differences between your conditions instead, no?
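
The concern about split.by can be illustrated with a toy example. This is a conceptual sketch in plain Python, not Seurat code: it assumes that scaling with a grouping variable centers and scales each group separately, and the scores are made-up values for a single TF.

```python
from statistics import mean, pstdev

# Hypothetical activity scores of one TF in cells from two conditions.
scores = {"ctrl": [0.5, 1.0, 1.5, 1.0], "stim": [3.5, 4.0, 4.5, 4.0]}

def zscore(values, center, spread):
    return [(v - center) / spread for v in values]

# Joint scaling: one mean and one standard deviation for all cells.
all_values = [v for values in scores.values() for v in values]
joint = {c: zscore(values, mean(all_values), pstdev(all_values))
         for c, values in scores.items()}

# Per-condition scaling: each condition is centered on its own mean.
split = {c: zscore(values, mean(values), pstdev(values))
         for c, values in scores.items()}

joint_gap = mean(joint["stim"]) - mean(joint["ctrl"])
split_gap = mean(split["stim"]) - mean(split["ctrl"])
print("difference between conditions after joint scaling:", round(joint_gap, 3))
print("difference between conditions after per-condition scaling:", round(split_gap, 3))
```

Scaling all cells together keeps the shift between conditions visible, whereas scaling each condition on its own centers both groups at zero and removes exactly the difference the plot is meant to show.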
