Error creating symlinks to BWAIndex #1492

Open
nevinwu opened this issue Apr 30, 2024 · 0 comments
Labels
bug Something isn't working

nevinwu commented Apr 30, 2024

Description of the bug

An execution failed in the FASTQ_CREATE_UMI_CONSENSUS_FGBIO:ALIGN_UMI:BWAMEM1_MEM process because the symlinks to BWAIndex in the task work directory were pointing to a different path from the one where the index was actually downloaded.

The symlinks in the work directory were pointing to: /work/stage/8f/b099e803ad6802621e8d1e1fdd38c7/BWAIndex
But the index was actually downloaded to: /work/stage/a5/bb9f3e3f91928e18c24b2e256655e6/

It has happened twice on this HPC system. Relaunching the execution with -resume seems to solve the problem.
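
One way to confirm this kind of mismatch before relaunching is to list the symlinks in a task work directory whose targets no longer exist. The following is a minimal Python sketch, not part of sarek or Nextflow; the function name `find_dangling_symlinks` is hypothetical and it assumes a POSIX filesystem.

```python
import os
from pathlib import Path


def find_dangling_symlinks(work_dir):
    """Return sorted (link, target) pairs for symlinks whose target is missing."""
    dangling = []
    # os.walk does not descend into symlinked directories by default.
    # Links to existing directories appear in dirnames; all other links,
    # including dangling ones, appear in filenames.
    for root, dirnames, filenames in os.walk(work_dir):
        for name in dirnames + filenames:
            path = Path(root) / name
            # Path.exists() follows the link, so it is False when the
            # target is gone, while is_symlink() inspects the link itself.
            if path.is_symlink() and not path.exists():
                dangling.append((str(path), os.readlink(path)))
    return sorted(dangling)
```

Calling `find_dangling_symlinks` on the failed task's work directory should report a BWAIndex link together with the stage path it points to, if the situation is the one described above.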

Issue #340 might be related.

Command used and terminal output

command:
nextflow run nf-core/sarek -r 3.1.2 -profile singularity -name tanda_completa -params-file /mnt/zonahpc/home/bioinformatica/comun/pruebas_borrar/test_sarek3/conf/nf-local-params_completa.yaml -c /mnt/zonahpc/home/bioinformatica/comun/pruebas_borrar/test_sarek3/conf/nextflow_fran.config

error output:
Workflow execution completed unsuccessfully!
The exit status of the task that caused the workflow execution to fail was: 1.

The full error message was:

Error executing process > 'NFCORE_SAREK:SAREK:FASTQ_CREATE_UMI_CONSENSUS_FGBIO:ALIGN_UMI:BWAMEM1_MEM (113-ffpe1-1)'

Caused by:
  Process `NFCORE_SAREK:SAREK:FASTQ_CREATE_UMI_CONSENSUS_FGBIO:ALIGN_UMI:BWAMEM1_MEM (113-ffpe1-1)` terminated with an error exit status (1)

Command executed:

  INDEX=`find -L ./ -name "*.amb" | sed 's/.amb//'`
  
  bwa mem \
      -K 100000000 -p -C -Y -R "@RG\tID:HVTWYDSX3.113-ffpe1.1\tPU:1\tSM:113_113-ffpe1\tLB:113-ffpe1\tDS:s3://ngi-igenomes/igenomes/Homo_sapiens/GATK/GRCh38/Sequence/WholeGenomeFasta/Homo_sapiens_assembly38.fasta\tPL:ILLUMINA" \
      -t 16 \
      $INDEX \
      113-ffpe1-1_interleaved.fq.gz \
      | samtools view -bS --threads 16 -o 113-ffpe1-1.umi_unsorted.bam -
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_SAREK:SAREK:FASTQ_CREATE_UMI_CONSENSUS_FGBIO:ALIGN_UMI:BWAMEM1_MEM":
      bwa: $(echo $(bwa 2>&1) | sed 's/^.*Version: //; s/Contact:.*$//')
      samtools: $(echo $(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*$//')
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
         -y INT        seed occurrence for the 3rd round seeding [20]
         -c INT        skip seeds with more than INT occurrences [500]
         -D FLOAT      drop chains shorter than FLOAT fraction of the longest overlapping chain [0.50]
         -W INT        discard a chain if seeded bases shorter than INT [0]
         -m INT        perform at most INT rounds of mate rescues for each read [50]
         -S            skip mate rescue
         -P            skip pairing; mate rescue performed unless -S also in use
  
  Scoring options:
  
         -A INT        score for a sequence match, which scales options -TdBOELU unless overridden [1]
         -B INT        penalty for a mismatch [4]
         -O INT[,INT]  gap open penalties for deletions and insertions [6,6]
         -E INT[,INT]  gap extension penalty; a gap of size k cost '{-O} + {-E}*k' [1,1]
         -L INT[,INT]  penalty for 5'- and 3'-end clipping [5,5]
         -U INT        penalty for an unpaired read pair [17]
  
         -x STR        read type. Setting -x changes multiple parameters unless overridden [null]
                       pacbio: -k17 -W40 -r10 -A1 -B1 -O1 -E1 -L0  (PacBio reads to ref)
                       ont2d: -k14 -W20 -r10 -A1 -B1 -O1 -E1 -L0  (Oxford Nanopore 2D-reads to ref)
                       intractg: -B9 -O16 -L5  (intra-species contigs to ref)
  
  Input/output options:
  
         -p            smart pairing (ignoring in2.fq)
         -R STR        read group header line such as '@RG\tID:foo\tSM:bar' [null]
         -H STR/FILE   insert STR to header if it starts with @; or insert lines in FILE [null]
         -o FILE       sam file to output results to [stdout]
         -j            treat ALT contigs as part of the primary assembly (i.e. ignore .alt file)
         -5            for split alignment, take the alignment with the smallest coordinate as primary
         -q            don't modify mapQ of supplementary alignments
         -K INT        process INT input bases in each batch regardless of nThreads (for reproducibility) []
  
         -v INT        verbosity level: 1=error, 2=warning, 3=message, 4+=debugging [3]
         -T INT        minimum score to output [30]
         -h INT[,INT]  if there are 80% of the max score, output all in XA [5,200]
         -a            output all alignments for SE or unpaired PE
         -C            append FASTA/FASTQ comment to SAM output
         -V            output the reference FASTA header in the XR tag
         -Y            use soft clipping for supplementary alignments
         -M            mark shorter split hits as secondary
  
         -I FLOAT[,FLOAT[,INT[,INT]]]
                       specify the mean, standard deviation (10% of the mean if absent), max
                       (4 sigma from the mean if absent) and min of the insert size distribution.
                       FR orientation only. [inferred]
  
  Note: Please read the man page for detailed description of the command line and options.
  
  [main_samview] fail to read the header from "-".

Work dir:
  /mnt/zonahpc/home/bioinformatica/comun/pruebas_borrar/test_sarek3/log_completa/work/64/9eb7698aa6f87ca16ef989cd20792f

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
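
A plausible reading of the log above (an inference, not something confirmed in this report): if the staged BWAIndex symlink is dangling, the `find -L` line matches no `.amb` file, `$INDEX` expands to nothing, `bwa mem` receives too few arguments and prints its usage text, and `samtools view` then fails to read a header from the empty stream. The sketch below mimics that index lookup in Python and fails early with a clearer message. `resolve_bwa_index` is a hypothetical helper, not part of the pipeline.

```python
import os


def resolve_bwa_index(task_dir):
    """Return the index prefix (path without '.amb'), following symlinks."""
    matches = []
    # followlinks=True descends into symlinked directories, similar to
    # `find -L`. A dangling symlink is not a directory, so nothing
    # beneath it is ever visited.
    for root, _dirs, files in os.walk(task_dir, followlinks=True):
        for name in files:
            if name.endswith(".amb"):
                matches.append(os.path.join(root, name[: -len(".amb")]))
    if not matches:
        raise FileNotFoundError(
            f"no *.amb file reachable from {task_dir}; "
            "check that the staged index symlink points to an existing directory"
        )
    if len(matches) > 1:
        raise ValueError(f"expected one index, found {len(matches)}: {sorted(matches)}")
    return matches[0]
```

Unlike the shell one-liner, this raises an error when the lookup comes back empty instead of passing an empty prefix on to the aligner.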

Relevant files

config file:
// Explicitly set the executor and queue for all processes
process {
    executor = "slurm"
    queue = "cpu"
    maxRetries = 2

    // Increase the time limit for processes with the label process_low
    withLabel:process_low {
        time = "20h"
    }

    // Increase the time limit for processes with the label process_medium
    withLabel:process_medium {
        time = "12h"
    }

    // Increase memory for GATK BaseRecalibrator
    withName: GATK4_BASERECALIBRATOR {
        memory = "8GB"
    }

    // Increase memory for GATK ApplyBQSR
    withName: GATK4_APPLYBQSR {
        memory = "12GB"
    }

    // Increase memory for samtools collatefastq
    withName: SAMTOOLS_COLLATEFASTQ {
        memory = "18GB"
    }

    // Increase memory for fgbio CallMolecularConsensusReads
    withName: FGBIO_CALLMOLECULARCONSENSUSREADS {
        memory = "90GB"
    }

    // Increase memory for fgbio GroupReadsByUmi
    withName: FGBIO_GROUPREADSBYUMI {
        memory = "100GB"
    }
}

// Singularity cache directory
singularity.cacheDir = "/mnt/zonahpc/home/bioinformatica/comun/pruebas_borrar/test_sarek3/singularity_cacheDir"

params file:
input: /mnt/zonahpc/home/bioinformatica/comun/pruebas_borrar/test_sarek3/conf/samplesheet_completa.csv
outdir: /mnt/zonahpc/home/bioinformatica/comun/pruebas_borrar/test_sarek3/results_completa
genome: GATK.GRCh38
wes: true
intervals: /mnt/zonahpc/home/bioinformatica/comun/panel_designs/HyperExome/target.bed
trim_fastq: true
umi_read_structure: 9M151T 151T
vep_include_fasta: true

System information

Nextflow version 22.10.1.5828
Hardware: HPC
Executor: slurm
Container engine: Singularity
