Background gene list for GSEA #8825

Open
zincfingers89 opened this issue Apr 25, 2024 · 2 comments

@zincfingers89

I have a Seurat object comprising my cells of interest in two different conditions: healthy and disease. I want to perform gene set enrichment analysis (GSEA) on the differentially expressed genes (DEGs) in one of my clusters.

However, I want an appropriate background gene list to compare my DEGs against. Using the whole genome isn’t relevant, as many genes won’t ever be expressed by my cell of interest, so of course they are going to be underrepresented in my DEGs.
 
I wondered if I can make a “background gene list specific to my cell of interest” from my data, i.e. I would like to pull out a list of all the genes expressed > 1 in my total Seurat object.
 
Is there an elegant way to do this?
 
rownames(object)
 
gives me a huge list of genes, many of which are never expressed in my object.
 
How do I obtain a list of genes that are expressed at least once?

@neanderthalensis

neanderthalensis commented Apr 29, 2024

There is probably a better way to do this, but this approach will tell you which genes are expressed in at least one cell in your subset.

dat <- subset(dat, subset = your_category == "your_value")  # change to the name of your object and your subset
rownames(dat@assays$RNA@counts)[rowSums(dat@assays$RNA@counts) > 1]  # change dat to the name of your object, RNA to the name of your assay

dat@assays$RNA@counts returns the count matrix. rowSums takes that matrix and returns a vector of the sum of counts for each gene across all cells in the matrix (i.e. the sum of each row, as the name suggests). The > 1 comparison converts this vector to a logical vector that is TRUE for genes whose total count is greater than 1, which we then use to subset the rownames() vector. Use > 0 instead if you want every gene with at least one count.
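
To make the effect of the threshold concrete, here is a minimal sketch on a made-up matrix. The gene names, cell names, and counts are all hypothetical, and no Seurat object is involved. It compares the total-count filter from the snippet above with a filter on the number of cells in which a gene is detected.

```r
# Toy counts matrix: rows are genes, columns are cells (all values are made up).
counts <- matrix(
  c(0, 0, 0,   # GeneA: never detected
    1, 0, 0,   # GeneB: a single count in one cell
    2, 3, 0,   # GeneC: detected in two cells
    0, 0, 5),  # GeneD: detected in one cell
  nrow = 4, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB", "GeneC", "GeneD"),
                  c("cell1", "cell2", "cell3"))
)

# Filter from the snippet above: total count across all cells greater than 1.
total_gt_1 <- rownames(counts)[rowSums(counts) > 1]

# Alternative filter: detected (count > 0) in at least one cell.
in_any_cell <- rownames(counts)[rowSums(counts > 0) >= 1]

print(total_gt_1)
print(in_any_cell)
```

With these toy values, GeneB (a single count in one cell) passes the second filter but not the first, so the choice of threshold changes which genes end up in the background list.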

@zincfingers89
Author

Thanks! I appreciate it, I'll give it a bash
