Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detecting Pseudobulk DACRs with FindMarkers using DESeq2 #8784

Open
yls2g13 opened this issue Apr 18, 2024 · 3 comments
Open

Detecting Pseudobulk DACRs with FindMarkers using DESeq2 #8784

yls2g13 opened this issue Apr 18, 2024 · 3 comments

Comments

@yls2g13
Copy link

yls2g13 commented Apr 18, 2024

Currently analysing a 10X Multiome dataset. When I run FindMarkers() on an ATAC assay with custom peaks using DESeq2 I get an error back. Appreciate if someone here could teach me how to understand and troubleshoot this.

Code used:

# https://github.com/satijalab/seurat/discussions/7763
# add 1 so that DESeq2 will run
so@assays$customPeakList$counts <- as.matrix(so@assays$customPeakList$counts+1)

# FindMarkers()
markers <- FindMarkers(so, ident.1 = "Control", ident.2 = "Knockout",
                   group.by = "sampleSource",
                   assay = "customPeakList",
                   test.use = "DESeq2")

Resulting error message:

converting counts to integer mode
gene-wise dispersion estimates
mean-dispersion relationship
Error in lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth,  : 
  newsplit: out of vertex space
In addition: There were 17 warnings (use warnings() to see them)

sessionInfo:

R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS:   /usr/lib64/libblas.so.3.4.2
LAPACK: /usr/lib64/liblapack.so.3.4.2

locale:
 [1] LC_CTYPE=en_AU.UTF-8          LC_NUMERIC=C                  LC_TIME=en_AU.UTF-8          
 [4] LC_COLLATE=en_AU.UTF-8        LC_MONETARY=en_AU.UTF-8       LC_MESSAGES=en_AU.UTF-8      
 [7] LC_PAPER=en_AU.UTF-8          LC_NAME=en_AU.UTF-8           LC_ADDRESS=en_AU.UTF-8       
[10] LC_TELEPHONE=en_AU.UTF-8      LC_MEASUREMENT=en_AU.UTF-8    LC_IDENTIFICATION=en_AU.UTF-8

attached base packages:
[1] grid      stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] DESeq2_1.38.3                      SummarizedExperiment_1.28.0        MatrixGenerics_1.10.0             
 [4] matrixStats_1.1.0                  SeuratDisk_0.0.0.9020              scCustomize_1.1.1                 
 [7] cowplot_1.1.3                      ggpubr_0.4.0                       xlsx_0.6.5                        
[10] RColorBrewer_1.1-3                 viridisLite_0.4.2                  circlize_0.4.15                   
[13] ComplexHeatmap_2.14.0              patchwork_1.1.3.9000               TFBSTools_1.34.0                  
[16] JASPAR2020_0.99.10                 BSgenome.Mmusculus.UCSC.mm10_1.4.3 BSgenome_1.64.0                   
[19] rtracklayer_1.56.1                 Biostrings_2.66.0                  XVector_0.38.0                    
[22] EnsDb.Mmusculus.v79_2.99.0         ensembldb_2.20.2                   AnnotationFilter_1.20.0           
[25] GenomicFeatures_1.48.4             AnnotationDbi_1.58.0               Biobase_2.58.0                    
[28] GenomicRanges_1.50.2               GenomeInfoDb_1.34.9                IRanges_2.32.0                    
[31] S4Vectors_0.36.2                   BiocGenerics_0.44.0                forcats_0.5.1                     
[34] stringr_1.5.1                      dplyr_1.1.4                        purrr_1.0.2                       
[37] readr_2.1.2                        tidyr_1.3.0                        tibble_3.2.1                      
[40] ggplot2_3.4.4                      tidyverse_1.3.1                    Signac_1.12.0                     
[43] Seurat_5.0.0                       SeuratObject_5.0.1                 sp_2.1-1                          

loaded via a namespace (and not attached):
  [1] rappdirs_0.3.3              ggprism_1.0.4               scattermore_1.2             R.methodsS3_1.8.2          
  [5] bit64_4.0.5                 knitr_1.45                  irlba_2.3.5.1               DelayedArray_0.24.0        
  [9] R.utils_2.12.2              data.table_1.14.8           KEGGREST_1.36.0             RCurl_1.98-1.13            
 [13] doParallel_1.0.17           generics_0.1.3              RSQLite_2.2.14              RANN_2.6.1                 
 [17] future_1.33.0               bit_4.0.4                   tzdb_0.3.0                  spatstat.data_3.0-3        
 [21] xml2_1.3.3                  lubridate_1.8.0             httpuv_1.6.12               DirichletMultinomial_1.38.0
 [25] xfun_0.41                   rJava_1.0-6                 hms_1.1.1                   evaluate_0.23              
 [29] promises_1.2.1              fansi_1.0.6                 restfulr_0.0.15             progress_1.2.2             
 [33] caTools_1.18.2              dbplyr_2.4.0                readxl_1.4.0                geneplotter_1.74.0         
 [37] igraph_1.5.1                DBI_1.1.3                   htmlwidgets_1.6.3           spatstat.geom_3.2-7        
 [41] paletteer_1.5.0             ellipsis_0.3.2              RSpectra_0.16-1             backports_1.4.1            
 [45] annotate_1.74.0             biomaRt_2.52.0              deldir_1.0-9                vctrs_0.6.5                
 [49] ROCR_1.0-11                 abind_1.4-5                 cachem_1.0.6                withr_2.5.0                
 [53] progressr_0.14.0            sctransform_0.4.1           GenomicAlignments_1.34.1    prettyunits_1.2.0          
 [57] goftest_1.2-3               cluster_2.1.3               dotCall64_1.1-0             lazyeval_0.2.2             
 [61] seqLogo_1.62.0              crayon_1.5.2                hdf5r_1.3.5                 spatstat.explore_3.2-5     
 [65] pkgconfig_2.0.3             vipor_0.4.5                 nlme_3.1-157                ProtGenerics_1.28.0        
 [69] rlang_1.1.3                 globals_0.16.2              lifecycle_1.0.4             miniUI_0.1.1.1             
 [73] filelock_1.0.2              fastDummies_1.7.3           BiocFileCache_2.6.1         modelr_0.1.8               
 [77] ggrastr_1.0.1               cellranger_1.1.0            polyclip_1.10-6             RcppHNSW_0.5.0             
 [81] lmtest_0.9-40               Matrix_1.6-3                carData_3.0-5               zoo_1.8-12                 
 [85] beeswarm_0.4.0              reprex_2.0.1                ggridges_0.5.4              GlobalOptions_0.1.2        
 [89] png_0.1-8                   rjson_0.2.21                bitops_1.0-7                R.oo_1.25.0                
 [93] KernSmooth_2.23-20          spam_2.10-0                 blob_1.2.3                  shape_1.4.6                
 [97] parallelly_1.36.0           spatstat.random_3.2-1       rstatix_0.7.0               ggsignif_0.6.3             
[101] CNEr_1.32.0                 scales_1.3.0                memoise_2.0.1               magrittr_2.0.3             
[105] plyr_1.8.9                  ica_1.0-3                   zlibbioc_1.44.0             compiler_4.2.0             
[109] BiocIO_1.6.0                clue_0.3-64                 fitdistrplus_1.1-11         snakecase_0.11.0           
[113] Rsamtools_2.14.0            cli_3.6.2                   listenv_0.9.0               pbapply_1.7-2              
[117] MASS_7.3-57                 tidyselect_1.2.0            stringi_1.8.1               yaml_2.3.7                 
[121] locfit_1.5-9.7              ggrepel_0.9.4               fastmatch_1.1-3             tools_4.2.0                
[125] future.apply_1.11.0         parallel_4.2.0              rstudioapi_0.15.0           TFMPvalue_0.0.8            
[129] foreach_1.5.2               janitor_2.2.0               gridExtra_2.3               Rtsne_0.16                 
[133] digest_0.6.33               shiny_1.8.0                 pracma_2.4.2                Rcpp_1.0.12                
[137] car_3.0-13                  broom_0.8.0                 later_1.3.1                 RcppAnnoy_0.0.21           
[141] httr_1.4.7                  colorspace_2.1-0            rvest_1.0.2                 XML_3.99-0.15              
[145] fs_1.5.2                    tensor_1.5                  reticulate_1.35.0           splines_4.2.0              
[149] uwot_0.1.16                 RcppRoll_0.3.0              rematch2_2.1.2              spatstat.utils_3.0-4       
[153] xlsxjars_0.6.1              plotly_4.10.3               xtable_1.8-4                jsonlite_1.8.7             
[157] poweRlaw_0.70.6             R6_2.5.1                    pillar_1.9.0                htmltools_0.5.7            
[161] mime_0.12                   glue_1.7.0                  fastmap_1.1.1               BiocParallel_1.32.6        
[165] codetools_0.2-18            utf8_1.2.4                  lattice_0.20-45             spatstat.sparse_3.0-3      
[169] ggbeeswarm_0.6.0            curl_5.1.0                  leiden_0.4.3.1              gtools_3.9.5               
[173] GO.db_3.15.0                survival_3.5-7              rmarkdown_2.25              munsell_0.5.0              
[177] GetoptLong_1.0.5            GenomeInfoDbData_1.2.9      iterators_1.0.14            haven_2.5.0                
[181] reshape2_1.4.4              gtable_0.3.4  
@morallawwithin
Copy link

You do NOT need to do this:

so@assays$customPeakList$counts <- as.matrix(so@assays$customPeakList$counts+1

You misunderstood the discussion at #7763 , this is only necassary if you want to run DESeq2 WITHOUT Seurat.

@yls2g13
Copy link
Author

yls2g13 commented Apr 22, 2024

I removed the step

so@assays$customPeakList$counts <- as.matrix(so@assays$customPeakList$counts+1

I now get this error message:
Does this mean my data is too sparse?

Error in estimateSizeFactorsForMatrix(counts(object), locfunc = locfunc,  : 
  every gene contains at least one zero, cannot compute log geometric means

@igrabski
Copy link
Contributor

Hi, this error occurs if every single feature contains at least one zero, so it does indicate sparsity. Can you describe more about your data here? How did you pseudobulk here? If you look at some entries of so@assays$customPeakList$counts, does it seem very sparse across the board or are there a couple of particular outlier samples causing the sparsity?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants