Input structure of Gene Set Enrichment Analysis #10

sajjad6al · 2016-11-10T02:36:29Z

I'm trying to write a Shiny app using this package, but I cannot work out the input structure required to run GSEA. More specifically, I need to know what the first argument of gsePathway() should be. I am aware it must be a ranked, ordered geneList, but how would this translate into a sample .csv file to be used as the input file?

For example, I'm using a list of genes (EntrezID), all in one column, written as a .csv file to run enrichPathway(), and that works perfectly. How would you prepare an input file for gsePathway()?

As a side note: I am able to run the analysis using the sample dataset embedded in the package.

Best regards,
Sajjad Abedian
New York City College of Technology

GuangchuangYu · 2016-11-10T02:54:57Z

see https://github.com/GuangchuangYu/DOSE/wiki/how-to-prepare-your-own-geneList.

Please let me know if you finished your shiny app, I can add a link in ReactomePA homepage.
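
For readers landing here from a search, the layout that wiki page describes can be sketched roughly as follows: a two-column .csv with the gene ID in the first column and a numeric ranking statistic (fold change, for example) in the second, turned into a named numeric vector sorted in decreasing order. The file contents, column names and IDs below are made-up placeholders, and the final gsePathway() call is left commented out because it needs ReactomePA installed.

```r
# Write a tiny example file; IDs and values are invented for illustration.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("ENTREZID,logFC",
             "1001,-0.8",
             "1002,2.3",
             "1003,0.4",
             "1004,-1.9"), csv_path)

d <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)

# 1. numeric vector: the ranking statistic
geneList <- d[, 2]
# 2. named vector: each value is named by its gene ID
names(geneList) <- as.character(d[, 1])
# 3. sorted vector: decreasing order
geneList <- sort(geneList, decreasing = TRUE)

print(geneList)

# With ReactomePA installed, this vector is what goes in as the first argument:
# x <- ReactomePA::gsePathway(geneList)
```

The same three steps apply whatever the ranking statistic is, as long as the IDs are of the type the enrichment function expects.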

sajjad6al · 2017-01-04T23:27:25Z

Dear Guangchuang Yu,

I've developed an app using the package and modified it based on what was needed at the time. The main modification is that I have enabled the users to input gene symbols (instead of EntrezID) when running pathway enrichment analysis or gene set enrichment analysis. I will include the sample input file for each as well to test the app.

I have a couple of questions regarding using the package.

1. My main problem is that, when the package is used in a Shiny app, the network figures become practically useless if the user chooses to view too many pathway categories. I understand that when it is run locally I can set the parameter "fix" to false and move the nodes around on my computer, but the Shiny app doesn't let me do that online. What do you suggest for over-crowded networks?
2. When the user downloads the GSEA plots generated in Gene Set Enrichment Analysis as PNG, only one of the two plot panels is saved, whereas when they are downloaded as PDF, both are saved. I'm trying to understand how those plots are generated within the package and how I can solve this problem.

Please use the input files to test all the functionalities of the app, and let me know about those two problems, and I would love to hear your general idea about how to make the app better.

https://sajjadabedian.shinyapps.io/ReactomePA/

Input files.zip

GuangchuangYu · 2017-01-05T02:51:20Z

for Q1, you may refer to YuLab-SMU/DOSE#12. I don't have time to develop D3Network version of these plots, but it can be done.

for Q2, I have no idea since I don't know how you implement that functionality.

The following code works for me in R console.

> require(ReactomePA)
> data(geneList)
> x = gsePathway(geneList)
> png("1474244.png")
> gseaplot(x, 1, title=x$Description[1])
> dev.off()
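
Since the app's download code is not shown, the following is only a guess at one thing worth checking: that the PNG device stays open until every panel has been drawn. It is a minimal sketch with base graphics standing in for the two GSEA panels; save_two_panels() is a hypothetical helper, not part of ReactomePA or DOSE.

```r
# Hypothetical helper: draw two stacked panels into one PNG file.
save_two_panels <- function(file, scores) {
  png(file, width = 800, height = 800)
  on.exit(dev.off(), add = TRUE)   # close the device only after both panels are drawn
  par(mfrow = c(2, 1))             # two rows, one column
  plot(scores, type = "h", main = "Ranked list metric")
  plot(cumsum(scores), type = "l", main = "Running score")
  invisible(file)
}

# Placeholder data: 100 random scores in decreasing order.
out <- save_two_panels(tempfile(fileext = ".png"),
                       sort(rnorm(100), decreasing = TRUE))
print(file.exists(out))
```

For ggplot2 or grid objects the same pattern applies, except that each object needs an explicit print() when it is drawn inside a function.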

sajjad6al · 2017-01-05T15:51:20Z

Thank you so much for your fast response. I will update the code accordingly and let you know how it goes.

aamarnani · 2017-03-19T17:06:59Z

Hello GuangChuangYu and Sajjad6al,

Thank you so much for putting together ReactomePA (GCY) and for making it into a very useful ShinyApp that is immensely user friendly (Saj).

I have been using Kallisto and Sleuth for RNA Sequencing analysis and it has been useful to move from the Sleuth output to your tools for pathway enrichment analysis. One note about doing so in case it is helpful for anyone:

When going from Sleuth analysis to the ReactomePA Shiny app, one needs to use gene symbols. However, when annotating transcript results directly from biomaRt, the UniProt gene ID and other annotations don't work. Instead, the solution was to annotate the transcript IDs with "external_gene_name" from biomaRt and then, in Excel, use the UPPER() function to convert the gene IDs from external_gene_name to all caps.

The gene IDs need to be in all CAPS for them to work in the ReactomePA Shiny App.
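
The upper-casing step can also be done in R rather than Excel. A minimal sketch, assuming the symbols sit in a column called ext_gene; the three-row data frame and its transcript IDs are placeholders for illustration.

```r
# Mouse-style symbols as they might come back in external_gene_name (examples only).
t2g <- data.frame(
  target_id = c("tx1", "tx2", "tx3"),   # placeholder transcript IDs
  ext_gene  = c("Actb", "Brca1", "Trp53"),
  stringsAsFactors = FALSE
)

# Equivalent of Excel's UPPER(): convert every symbol to upper case.
t2g$ext_gene_upper <- toupper(t2g$ext_gene)
print(t2g)
```

Note that upper-casing is not an ortholog mapping: Trp53 becomes TRP53, whereas the human symbol is TP53, so some genes may still fail to match.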

Hopefully this is helpful for others trying to do something similar!

Kind Regards,

Abhi
MD/PhD Student
SUNY Downstate Medical Center
PS: Here is the code that I use when annotating the files while using the library("sleuth") package in R that I found most useful for downstream pathway analysis and other analyses.

source("http://bioconductor.org/biocLite.R")
biocLite("biomaRt")
require(biomaRt)

mart <- biomaRt::useMart(biomart = "ENSEMBL_MART_ENSEMBL",
                         dataset = "mmusculus_gene_ensembl",
                         host = 'ensembl.org')
listAttributes(mart)
t2g <- biomaRt::getBM(attributes = c("ensembl_transcript_id", "ensembl_gene_id",
                                     "external_gene_name", "chromosome_name",
                                     "entrezgene", "ucsc"),
                      mart = mart)
t2g <- dplyr::rename(t2g,
                     target_id = ensembl_transcript_id,
                     ens_gene = ensembl_gene_id,
                     ext_gene = external_gene_name,
                     Chrom_name = chromosome_name,
                     entrezgene = entrezgene,
                     ucsc = ucsc)

Diango700 · 2020-04-23T08:30:06Z

Hello
I tried to create my geneList with this R code:

setwd("C:/cygwin64/home/DIANGO/EXCELL/")
d = read.csv("PA_down_id.csv",sep = " ", header = F)
head(d)

output:

	ID	FLC
1	PF3D7_0936800	-7.897314
2	PF3D7_1478900	-1.709372
3	PF3D7_1009700	-1.255239
4	PF3D7_0508500	-1.137078
5	PF3D7_1458700	-1.368088
6	PF3D7_1124600	-1.259540

geneList =d[,2]
names(geneList) = as.character(d[,1])
geneList = sort(geneList, decreasing = TRUE)
head(geneList)

library(ReactomePA)
data(geneList)
de <- names(geneList)[abs(geneList) > 1.5]
head(de)

output:
geneList dataset not found
[1] "PF3D7_0500600" "PF3D7_0221400" "PF3D7_0937300" "PF3D7_1478900" "PF3D7_0413300" "PF3D7_0421500"

x <- enrichPathway(gene=de,pvalueCutoff=0.05, readable=T)
head(as.data.frame(x))

output:

--> No gene can be mapped....
--> Expected input gene ID: PF3D7_1405400,PF3D7_1123800,PF3D7_0725200,PF3D7_1439000,PF3D7_0716100,PF3D7_1421900
--> return NULL...

Please help me create my dataset. I'm working on Plasmodium falciparum, and my gene IDs were generated from the plasmodb.org database.
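
Before calling the enrichment functions, it can help to confirm that the vector really has the expected shape. The sketch below rebuilds the six rows shown above inline (tab-separated with a header row, which is an assumption about the file) and runs a few base R checks; check_gene_list() is a hypothetical helper, not a ReactomePA function. It deliberately does not call data(geneList) afterwards, because if a dataset of that name is found it replaces the vector just built.

```r
# The six rows from the post, inlined so the example is self-contained.
rows <- c("ID\tFLC",
          "PF3D7_0936800\t-7.897314",
          "PF3D7_1478900\t-1.709372",
          "PF3D7_1009700\t-1.255239",
          "PF3D7_0508500\t-1.137078",
          "PF3D7_1458700\t-1.368088",
          "PF3D7_1124600\t-1.259540")
d <- read.delim(text = rows, header = TRUE, stringsAsFactors = FALSE)

geneList <- d$FLC
names(geneList) <- d$ID
geneList <- sort(geneList, decreasing = TRUE)

# Hypothetical helper: structural checks on a ranked gene list.
check_gene_list <- function(x) {
  c(numeric = is.numeric(x),
    named   = !is.null(names(x)),
    unique  = anyDuplicated(names(x)) == 0,
    sorted  = !is.unsorted(rev(x)))   # decreasing order
}
print(check_gene_list(geneList))

# Genes passing the same |value| > 1.5 threshold used in the post.
de <- names(geneList)[abs(geneList) > 1.5]
print(de)
```

These checks only cover the structure of the vector. Whether the IDs can be mapped still depends on the organism and ID type the enrichment function is using.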
