Same sample name across multiple patients does not fail input schema validation #1503

Open
maxulysse opened this issue May 6, 2024 · 1 comment
Labels
bug Something isn't working

Comments

Member

maxulysse commented May 6, 2024

Description of the bug

This is the output seen on the terminal once the pipeline has failed after GATK4_MARKDUPLICATES. My guess is that one of the later join operators is causing the subsequent failure:

Detected join operation duplicate emission on left channel -- offending element: key=[patient:test2, sample:test, sex:XX, status:0, n_fastq:1, data_type:bam, id:test]; value=/home/max/workspace/sarek/work/fe/2e8890cae572ee686c7475edd6e895/test.md.cram

We should really fail early for that.

Issue reported by Ist4lri

Command used and terminal output

No response

Relevant files

No response

System information

No response

maxulysse added the bug label May 6, 2024

Ist4lri commented May 7, 2024

  1. Command used and terminal output:
nextflow run nf-core/sarek -r dev -profile singularity -c custom.config -params-file nf-params.json
Error: Detected join operation duplicate emission on left channel -- offending element: key=[patient:test2, sample:test, sex:XX, status:0, n_fastq:1, data_type:bam, id:test]; value=/home/max/workspace/sarek/work/fe/2e8890cae572ee686c7475edd6e895/test.md.cram
  2. Relevant files:

With this samplesheet (sample.csv):

patient,sample,lane,fastq_1,fastq_2,status
BR664F,liver,1,/path/to/the/file/BR664F_R1.fastq.gz,/path/to/the/file/BR664F_R2.fastq.gz,1
BR665F,liver,1,/path/to/the/file/BR665F_R1.fastq.gz,/path/to/the/file/BR665F_R2.fastq.gz,1
BR666F,liver,1,/path/to/the/file/BR666F_R1.fastq.gz,/path/to/the/file/BR666F_R2.fastq.gz,1
BR667F,liver,1,/path/to/the/file/BR667F_R1.fastq.gz,/path/to/the/file/BR667F_R2.fastq.gz,1
BR668F,liver,1,/path/to/the/file/BR668F_R1.fastq.gz,/path/to/the/file/BR668F_R2.fastq.gz,1
BR669F,liver,1,/path/to/the/file/BR669F_R1.fastq.gz,/path/to/the/file/BR669F_R2.fastq.gz,1
BR670F,liver,1,/path/to/the/file/BR670F_R1.fastq.gz,/path/to/the/file/BR670F_R2.fastq.gz,1
BR671F,liver,1,/path/to/the/file/BR671F_R1.fastq.gz,/path/to/the/file/BR671F_R2.fastq.gz,1

And this nf-params.json:

{
    "input": "sample.csv",
    "outdir": "results",
    "wes": "true",
    "fasta": "/path/to/this/file/GRCh38_latest_genomic.fna",
    "aligner": "bwa-mem2",
    "skip_tools": "baserecalibrator,markduplicates"
}
  3. System information:

HPC Curta on MCIA (Mésocentre calcul intensif aquitain)
I downloaded sarek to local files on the cluster, because there is no profile for this cluster (it is not the same as IFB).
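
For illustration, the fail-early check asked for in the issue description could look roughly like the sketch below. It is written in plain Python and is not sarek's actual samplesheet validation. The function names are hypothetical, and the rule it enforces (a sample name may belong to only one patient) is an assumption taken from the issue title. The embedded data is the first two rows of the samplesheet posted above.

```python
import csv
import io
from collections import defaultdict


def samples_shared_across_patients(csv_text):
    """Return {sample: sorted patients} for sample names used by more than one patient."""
    patients_by_sample = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        patients_by_sample[row["sample"]].add(row["patient"])
    return {
        sample: sorted(patients)
        for sample, patients in patients_by_sample.items()
        if len(patients) > 1
    }


def validate_samplesheet(csv_text):
    """Raise ValueError up front if a sample name is reused across patients."""
    shared = samples_shared_across_patients(csv_text)
    if shared:
        details = "; ".join(
            f"{sample!r} is used by patients {', '.join(patients)}"
            for sample, patients in shared.items()
        )
        raise ValueError(f"Sample names must be unique across patients: {details}")


# First two rows of the samplesheet from this report (hypothetical helper input).
SAMPLESHEET = """\
patient,sample,lane,fastq_1,fastq_2,status
BR664F,liver,1,/path/to/the/file/BR664F_R1.fastq.gz,/path/to/the/file/BR664F_R2.fastq.gz,1
BR665F,liver,1,/path/to/the/file/BR665F_R1.fastq.gz,/path/to/the/file/BR665F_R2.fastq.gz,1
"""

if __name__ == "__main__":
    try:
        validate_samplesheet(SAMPLESHEET)
    except ValueError as err:
        print(err)
```

Run against the rows above, the check rejects the samplesheet because `liver` is shared by BR664F and BR665F, which is the situation that otherwise only surfaces later as the join error. Several lanes of the same sample within one patient are still accepted, since they map to a single patient.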
