STARsolo Workflow Doesn't Output Solo.out counts #187

Open
sean-at-tessera opened this issue Dec 8, 2022 · 2 comments
Labels
bug Something isn't working

Comments

@sean-at-tessera

Description of the bug

The STARsolo workflow completes successfully and produces many of the expected files (the .Log files, the *Aligned.sortedByCoord.out.bam files, and the .out.tab files). However, it doesn't produce the expected Solo.out directory or the corresponding count matrices. As far as I can tell, there are no count outputs at all.
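
One way to confirm the symptom is to search a local copy of the results for directories ending in Solo.out, which is the name STARsolo gives its count directory. The sketch below assumes the results have been synced to local disk (the run here publishes to S3). `check_solo_outputs` is a hypothetical helper, not part of nf-core/scrnaseq.

```shell
#!/usr/bin/env sh
# Hypothetical helper: report STARsolo count directories under a local results directory.
check_solo_outputs() {
    results_dir="$1"
    # STARsolo writes counts to "<outFileNamePrefix>Solo.out/"
    solo_dirs=$(find "$results_dir" -type d -name '*Solo.out' | sort)
    if [ -z "$solo_dirs" ]; then
        echo "MISSING no *Solo.out directory under $results_dir"
        return 1
    fi
    # Gene is STARsolo's default feature type; its matrices live in Solo.out/Gene/
    echo "$solo_dirs" | while IFS= read -r dir; do
        if [ -d "$dir/Gene" ]; then
            echo "OK $dir"
        else
            echo "NO_GENE_DIR $dir"
        fi
    done
}
```

Calling `check_solo_outputs ./results` (path assumed) prints one line per Solo.out directory found and returns a non-zero status when none exist.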

Command used and terminal output

nextflow run 'https://github.com/nf-core/scrnaseq' \
		 -name jolly_avogadro \
		 -params-file [redacted] \
		 -with-tower [redacted] \
		 -r 2.0.0 \
		 -resume 0da894f0-c219-4ccf-946c-3d2eee8acecc

The final parameters (other than `genomes`) were: 

bustools_correct = true
custom_config_base = https://raw.githubusercontent.com/nf-core/configs/master
multiqc_title = [redacted]
plaintext_email = false
monochrome_logs = false
aligner = star
max_cpus = 16
custom_config_version = master
max_memory = 128.GB
skip_multiqc = false
protocol = 10XV3
kb_workflow = standard
gtf = s3://[redacted]/resources/10x_genomics/refdata-gex-GRCh38-2020-A/genes/genes.gtf
max_multiqc_email_size = 25.MB
max_time = 240.h
schema_ignore_params = genomes
tracedir = ${params.outdir}/pipeline_info
validate_params = true
skip_bustools = false
igenomes_ignore = false
outdir = s3://[redacted]
publish_dir_mode = copy
input = s3://[redacted]
help = false
igenomes_base = s3://ngi-igenomes/igenomes
genome_fasta = s3://[redacted]
show_hidden_params = false
enable_conda = false

Relevant files

No response

System information

  • Nextflow version 22.06.1.edge build 5712
  • Run on cloud
  • Executor: awsbatch
  • Version of nf-core/scrnaseq: 2.1.0

sean-at-tessera added the bug label on Dec 8, 2022
@grst
Member

grst commented Feb 21, 2023

Unified count matrices are produced downstream of the alignment process, so the output is the same for all aligners. Since #160 was merged, they are also available in MTX format (currently only in the dev version).

Would that already be enough for your use-case? Or are you looking for any specific output file?

It probably wouldn't hurt to include the entire Solo.out folder in the output. We need to do some work on restructuring the output directory anyway (see also e.g. #178).
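
For the MTX output mentioned above, a quick sanity check is to read the matrix dimensions from the file itself. In the Matrix Market format, lines starting with `%` are header or comment lines, and the first remaining line holds the row count, column count, and number of stored entries. This is a minimal sketch that assumes an uncompressed file; `mtx_dims` is a hypothetical helper, and the file names the pipeline uses are not specified here.

```shell
#!/usr/bin/env sh
# Hypothetical helper: print "rows cols entries" for an uncompressed Matrix Market file.
mtx_dims() {
    # Skip '%' header/comment lines; the first remaining line is the size line.
    awk '!/^%/ { print $1, $2, $3; exit }' "$1"
}
```

A result whose third number is 0 would indicate an empty count matrix. For a gzipped file, the same awk program could be fed from `gzip -dc` instead.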

@sean-at-tessera
Author

@grst I will re-run my dataset on the dev branch and try the unified count matrices. I do think it'd be a nice option to publish the default Solo.out directory for those who might want it.

An important gotcha on that front is what happens when this pipeline runs on, e.g., AWS Batch with Docker. The output directory is written as root, so it can't be copied to cloud storage when the pipeline completes. My team ran into this when writing a homespun single-cell Nextflow pipeline. If this functionality is incorporated into nf-core/scrnaseq, I'd recommend setting chmod 777 on Solo.out upon completion.
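
As a rough sketch of that permission fix, the snippet below opens up a Solo.out directory after STAR finishes. It is an assumption about how the step might look, not existing pipeline code: `open_solo_permissions` is a hypothetical helper, and it applies `a+rwX` recursively instead of a literal 777, so directories become traversable while regular files are not marked executable.

```shell
#!/usr/bin/env sh
# Hypothetical helper: make a root-owned Solo.out directory readable and writable
# by every user so a later stage-out step can copy it.
open_solo_permissions() {
    solo_dir="$1"
    if [ ! -d "$solo_dir" ]; then
        echo "not a directory: $solo_dir" >&2
        return 1
    fi
    # a+rwX: read/write for all; execute only on directories
    # (and on files that are already executable)
    chmod -R a+rwX "$solo_dir"
}
```

In a Nextflow process this could be called at the end of the script block, for example `open_solo_permissions sampleA.Solo.out` (directory name assumed).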
