How to split the results of compareCluster? #677

Open
Ldec12 opened this issue Mar 21, 2024 · 1 comment

Comments


Ldec12 commented Mar 21, 2024

The geneList contains three groups. After running GSEA_GO <- compareCluster(geneList, fun = "GSEA", TERM2GENE = m_df, eps = 0, pvalueCutoff = 0.2), how can I split GSEA_GO into three groups according to the original grouping?
Thanks.


guidohooiveld commented Mar 21, 2024

It is not clear to me what exactly you are trying to achieve.

If you would like a dotplot that shows the 3 input lists and also separates the activated from the suppressed gene sets, use the argument split=".sign" together with the function facet_grid (from ggplot2); thus:

dotplot(GSEA_GO, showCategory=10, split=".sign") + facet_grid(.~.sign)

I agree that this is not well documented!

> library(clusterProfiler)
> library(enrichplot)
> library(ggplot2)  # facet_grid() is a ggplot2 function
> library(org.Hs.eg.db)
>  
> data(geneList, package="DOSE")
> inputList <- list(GeneList1 = geneList,
+                   GeneList2 = geneList,
+                   GeneList3 = rev(-1*geneList) )  # reverse order
> 
> ## compareCluster-GSEA
> xx <- compareCluster(geneClusters=inputList, fun = "gseGO",
+              OrgDb = org.Hs.eg.db, keyType = "ENTREZID",
+              ont = "BP", eps = 0, pvalueCutoff = 0.05,
+              pAdjustMethod = "none", minGSSize = 15, maxGSSize = 500)
> xx <- enrichplot::pairwise_termsim(xx) 
> xx <- setReadable(xx, 'org.Hs.eg.db', 'ENTREZID')
> 
> 
> p <- dotplot(xx, font.size = 8, showCategory = 8, title = "GSEA results",
+              split = ".sign") + facet_grid(. ~ .sign)
> print(p)
> 


If you would like to split the numeric output itself, use the function split on the column named Cluster. The result is a list with one data frame per group (3 here). This list can then easily be exported, for example to Excel with one worksheet per group, using the functions addWorksheet, writeData and saveWorkbook from the package openxlsx.

> out.list <- split(as.data.frame(xx), as.data.frame(xx)$Cluster)
> str(out.list)
List of 3
 $ GeneList1:'data.frame':      1281 obs. of  12 variables:
  ..$ Cluster        : Factor w/ 3 levels "GeneList1","GeneList2",..: 1 1 1 1 1 1 1 1 1 1 ...
  ..$ ID             : chr [1:1281] "GO:0007059" "GO:0051276" "GO:0098813" "GO:0000819" ...
  ..$ Description    : chr [1:1281] "chromosome segregation" "chromosome organization" "nuclear chromosome segregation" "sister chromatid segregation" ...
  ..$ setSize        : int [1:1281] 316 470 236 184 325 151 224 360 487 440 ...
  ..$ enrichmentScore: num [1:1281] 0.588 0.519 0.631 0.654 0.541 ...
  ..$ NES            : num [1:1281] 2.79 2.56 2.88 2.88 2.56 ...
  ..$ pvalue         : num [1:1281] 2.37e-31 1.04e-30 2.46e-29 1.87e-26 2.52e-24 ...
  ..$ p.adjust       : num [1:1281] 2.37e-31 1.04e-30 2.46e-29 1.87e-26 2.52e-24 ...
  ..$ qvalue         : num [1:1281] 8.26e-28 1.81e-27 2.85e-26 1.62e-23 1.75e-21 ...
  ..$ rank           : num [1:1281] 449 1374 449 532 1246 ...
  ..$ leading_edge   : chr [1:1281] "tags=20%, list=4%, signal=20%" "tags=24%, list=11%, signal=22%" "tags=22%, list=4%, signal=22%" "tags=25%, list=4%, signal=24%" ...
  ..$ core_enrichment: chr [1:1281] "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/HJURP/SKA1/NUSAP1/TPX2/TACC3/NEK2/CENPM"| __truncated__ "CDC45/CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/HJURP/NUSAP1/TPX2/TACC3/NEK2/CENPN/CDK1/MAD2"| __truncated__ "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/NUSAP1/TPX2/TACC3/NEK2/CDK1/MAD2L1/KIF1"| __truncated__ "CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/NUSAP1/TPX2/TACC3/NEK2/CDK1/MAD2L1/KIF18A/CDT1/BIR"| __truncated__ ...
 $ GeneList2:'data.frame':      1289 obs. of  12 variables:
  ..$ Cluster        : Factor w/ 3 levels "GeneList1","GeneList2",..: 2 2 2 2 2 2 2 2 2 2 ...
  ..$ ID             : chr [1:1289] "GO:0051276" "GO:0007059" "GO:0098813" "GO:0000819" ...
  ..$ Description    : chr [1:1289] "chromosome organization" "chromosome segregation" "nuclear chromosome segregation" "sister chromatid segregation" ...
  ..$ setSize        : int [1:1289] 470 316 236 184 325 151 224 360 487 440 ...
  ..$ enrichmentScore: num [1:1289] 0.519 0.588 0.631 0.654 0.541 ...
  ..$ NES            : num [1:1289] 2.55 2.77 2.89 2.89 2.56 ...
  ..$ pvalue         : num [1:1289] 9.43e-32 9.11e-31 4.40e-29 1.61e-26 1.12e-24 ...
  ..$ p.adjust       : num [1:1289] 9.43e-32 9.11e-31 4.40e-29 1.61e-26 1.12e-24 ...
  ..$ qvalue         : num [1:1289] 3.27e-28 1.58e-27 5.09e-26 1.39e-23 7.79e-22 ...
  ..$ rank           : num [1:1289] 1374 449 449 532 1246 ...
  ..$ leading_edge   : chr [1:1289] "tags=24%, list=11%, signal=22%" "tags=20%, list=4%, signal=20%" "tags=22%, list=4%, signal=22%" "tags=25%, list=4%, signal=24%" ...
  ..$ core_enrichment: chr [1:1289] "CDC45/CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/HJURP/NUSAP1/TPX2/TACC3/NEK2/CENPN/CDK1/MAD2"| __truncated__ "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/HJURP/SKA1/NUSAP1/TPX2/TACC3/NEK2/CENPM"| __truncated__ "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/NUSAP1/TPX2/TACC3/NEK2/CDK1/MAD2L1/KIF1"| __truncated__ "CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/NUSAP1/TPX2/TACC3/NEK2/CDK1/MAD2L1/KIF18A/CDT1/BIR"| __truncated__ ...
 $ GeneList3:'data.frame':      1318 obs. of  12 variables:
  ..$ Cluster        : Factor w/ 3 levels "GeneList1","GeneList2",..: 3 3 3 3 3 3 3 3 3 3 ...
  ..$ ID             : chr [1:1318] "GO:0051276" "GO:0007059" "GO:0098813" "GO:0000819" ...
  ..$ Description    : chr [1:1318] "chromosome organization" "chromosome segregation" "nuclear chromosome segregation" "sister chromatid segregation" ...
  ..$ setSize        : int [1:1318] 470 316 236 184 325 151 360 224 487 104 ...
  ..$ enrichmentScore: num [1:1318] -0.519 -0.588 -0.631 -0.654 -0.541 ...
  ..$ NES            : num [1:1318] -2.53 -2.79 -2.89 -2.9 -2.57 ...
  ..$ pvalue         : num [1:1318] 1.12e-30 1.97e-30 1.55e-29 6.89e-26 1.64e-24 ...
  ..$ p.adjust       : num [1:1318] 1.12e-30 1.97e-30 1.55e-29 6.89e-26 1.64e-24 ...
  ..$ qvalue         : num [1:1318] 3.40e-27 3.40e-27 1.78e-26 5.92e-23 9.75e-22 ...
  ..$ rank           : num [1:1318] 1375 450 450 533 1247 ...
  ..$ leading_edge   : chr [1:1318] "tags=24%, list=11%, signal=22%" "tags=27%, list=4%, signal=27%" "tags=22%, list=4%, signal=22%" "tags=25%, list=4%, signal=24%" ...
  ..$ core_enrichment: chr [1:1318] "MIS18A/NBN/UCHL5/SMC2/SPDL1/PRKCQ/NASP/RFC3/TUBG1/RMDN1/PSRC1/ERCC4/CHEK2/RAD21/BRIP1/NUP155/MCM3/H1-5/CCT2/SMC"| __truncated__ "MIS18A/SRPK1/SMC2/SPDL1/CIAO2B/TUBG1/RMDN1/PSRC1/TUBB/CHEK2/RAD21/BRIP1/PLSCR1/RCC1/SMC4/SLC25A5/NCAPD3/FIRRM/K"| __truncated__ "ZWILCH/FBXO5/CENPF/BUB1/NCAPD2/CCNE2/CCNE1/ESPL1/CENPI/ECT2/SPAG5/SPC25/ZWINT/BUB1B/PTTG3P/RACGAP1/PLK1/CDC6/KI"| __truncated__ "NCAPG2/ZWILCH/FBXO5/CENPF/BUB1/NCAPD2/ESPL1/CENPI/SPAG5/SPC25/ZWINT/BUB1B/RACGAP1/PLK1/CDC6/KIF2C/KIF14/KIF4A/C"| __truncated__ ...
> 
> 
> library(openxlsx)
> 
> wb <- createWorkbook()
> Map(function(data, nameofsheet){     
+     addWorksheet(wb, nameofsheet)
+     writeData(wb, nameofsheet, data)},
+     out.list, names(out.list) )
$GeneList1
[1] 0

$GeneList2
[1] 0

$GeneList3
[1] 0

> saveWorkbook(wb, file = "all.compareCluster.results.GOBP.xlsx", overwrite = TRUE)
> 
> 
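
The split step above can also be tried without running a full enrichment. The following is only a minimal sketch on a toy data frame: the object `res` and its values are made up for illustration and merely stand in for `as.data.frame(xx)`; `out.slim` is a hypothetical name for a variant without the Cluster column.

```r
# Toy stand-in for as.data.frame(xx): one row per (Cluster, term) pair.
# The values are illustrative only, not real enrichment results.
res <- data.frame(
  Cluster = factor(c("GeneList1", "GeneList1", "GeneList2", "GeneList3"),
                   levels = c("GeneList1", "GeneList2", "GeneList3")),
  ID      = c("GO:0007059", "GO:0051276", "GO:0051276", "GO:0007059"),
  NES     = c(2.79, 2.56, 2.55, -2.79),
  stringsAsFactors = FALSE
)

# split() returns a named list with one data frame per level of Cluster.
out.list <- split(res, res$Cluster)

names(out.list)         # names of the groups
sapply(out.list, nrow)  # number of rows per group

# Optional: drop the Cluster column, which is constant within each piece.
out.slim <- lapply(out.list, function(d) {
  d[, setdiff(names(d), "Cluster"), drop = FALSE]
})
```

A single group can then be accessed by name, for example `out.list[["GeneList1"]]` or `out.list$GeneList1`.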

all.compareCluster.results.GOBP.xlsx
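
If Excel output is not required, the same list could presumably also be written as one CSV file per group with base R only. This is a sketch under assumptions: the toy `out.list`, the directory name `compareCluster_csv` and the object names `out.dir` and `csv.files` are hypothetical and not part of the answer above.

```r
# Toy named list standing in for out.list (one data frame per group).
out.list <- list(
  GeneList1 = data.frame(ID = c("GO:0007059", "GO:0051276"), NES = c(2.79, 2.56)),
  GeneList2 = data.frame(ID = "GO:0051276", NES = 2.55),
  GeneList3 = data.frame(ID = "GO:0007059", NES = -2.79)
)

# Hypothetical output directory; a temporary one avoids cluttering the working directory.
out.dir <- file.path(tempdir(), "compareCluster_csv")
dir.create(out.dir, showWarnings = FALSE)

# Write one CSV per list element, named after the group, and collect the paths.
csv.files <- vapply(names(out.list), function(nm) {
  path <- file.path(out.dir, paste0(nm, ".csv"))
  write.csv(out.list[[nm]], file = path, row.names = FALSE)
  path
}, character(1))

basename(csv.files)
```

For Excel output in a single call, openxlsx::write.xlsx is documented to accept a named list and to write one worksheet per element, which may be a shorter alternative to the Map construct shown above.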

