Huge 10X scRNA-seq mouse data #70

alexfernandes8a · 2022-07-20T20:17:44Z

Hi! I have a huge 10X scRNA-seq mouse dataset (~60Gb BAM file | ~50K cells from 12 mice) that I am trying to run through cellSNP-lite. I compiled cellSNP-lite in an HPC environment and I am running it there in mode 2a.
The problem is that, no matter how much RAM I use, I keep getting the message "Combined max depth is above 1M. Potential memory hog!", and the job has already been running for 11 days.
I know it is a lot of data, so I am wondering what the best approach would be in this scenario. Perhaps splitting the cell barcodes file?
Any help is highly appreciated!
Thank you so very much.

hxj5 · 2022-07-21T02:02:00Z

Hi, Mode 2a is more suitable for small datasets. For large datasets, you may try Mode 2b + Mode 1a. Mode 2a does joint calling and genotyping, but it is substantially slower than first calling SNPs in a bulk manner with Mode 2b and then genotyping the cells at those SNPs with Mode 1a. To speed things up, you may try the --minMAF 0.1 --minCOUNT 100 options in both modes.
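
The two-step workflow could be sketched roughly as below. This is a minimal sketch, not a tested pipeline: the file names, output directories, and thread count are hypothetical placeholders, and the `run` helper with its `DRY_RUN` switch is only an illustrative wrapper that prints each command before (optionally) executing it. It also assumes that Mode 2b run with `--gzip` writes its candidate SNPs to `cellSNP.base.vcf.gz` in its output directory, which is then passed to Mode 1a through `-R`; please check the flags against `cellsnp-lite --help` for your installed version.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical inputs and outputs; replace with your own paths.
BAM="${BAM:-possorted_genome_bam.bam}"
BARCODES="${BARCODES:-barcodes.tsv}"
OUT_2B="${OUT_2B:-cellsnp_mode2b}"
OUT_1A="${OUT_1A:-cellsnp_mode1a}"
THREADS="${THREADS:-16}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, anything else = run them

# Illustrative helper: echo the command, and execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

# Step 1 (Mode 2b): call SNPs in a bulk manner, without cell barcodes.
run cellsnp-lite -s "$BAM" -O "$OUT_2B" -p "$THREADS" \
    --minMAF 0.1 --minCOUNT 100 --cellTAG None --gzip

# Step 2 (Mode 1a): genotype each cell at the SNPs found in step 1.
# Assumes step 1 wrote cellSNP.base.vcf.gz into $OUT_2B.
run cellsnp-lite -s "$BAM" -b "$BARCODES" -O "$OUT_1A" \
    -R "$OUT_2B/cellSNP.base.vcf.gz" -p "$THREADS" \
    --minMAF 0.1 --minCOUNT 100 --gzip
```

With the default `DRY_RUN=1` the script only prints the two commands, which makes it easy to inspect them before submitting the job; setting `DRY_RUN=0` runs them in sequence.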
