Giving a user fasta file, but keeping all default file paths #1514

Open
Ist4lri opened this issue May 7, 2024 · 1 comment
Labels
bug Something isn't working

Comments

Ist4lri commented May 7, 2024

Description of the bug

I provided a FASTA file to run Mutect2 and got this error:

A USER ERROR has occurred: Fasta index file file://GRCh38_latest_genomic.fna.fai for reference file://GRCh38_latest_genomic.fna does not exist. Please see https://gatk.broadinstitute.org/hc/articles/360035531652-FASTA-Reference-genome-format for help creating it.

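This comes from GATK, in the Mutect2 part of the pipeline (the log below shows the GETPILEUPSUMMARIES process failing).

But my file is there, and it does exist.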

Command used and terminal output

`nextflow run nf-core/sarek -r dev -profile singularity -c custom.config -params-file nf-params.json`

nf-params.json:

{
    "input": "sample.csv",
    "outdir": "results",
    "wes": "true",
    "fasta": "/gpfs/home/plgouttebel/home/exomic/data/ref/GRCh38_latest_genomic.fna",
    "aligner": "bwa-mem2",
    "tools": "mutect2",
    "skip_tools": "baserecalibrator,markduplicates"
}

custom.config:
singularity.cacheDir = '/scratch/plgouttebel/data_Singula/nf-core-sarek_dev/singularity-images'

Output from the log file:

May-07 15:15:33.560 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run
May-07 15:15:33.560 [Task submitter] INFO  nextflow.Session - [48/601cc5] Submitted process > NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_TUMOR_ONLY_ALL:BAM_VARIANT_CALLING_TUMOR_ONLY_MUTECT2:MUTECT2 (BR666F)
May-07 15:15:33.655 [Task monitor] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
  task: name=NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_TUMOR_ONLY_ALL:BAM_VARIANT_CALLING_TUMOR_ONLY_MUTECT2:GETPILEUPSUMMARIES (BR666F); work-dir=/scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/a0/84752509ea76ccd51c89f3b8af9c20
  error [nextflow.exception.ProcessFailedException]: Process `NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_TUMOR_ONLY_ALL:BAM_VARIANT_CALLING_TUMOR_ONLY_MUTECT2:GETPILEUPSUMMARIES (BR666F)` terminated with an error exit status (2)
May-07 15:15:33.763 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_TUMOR_ONLY_ALL:BAM_VARIANT_CALLING_TUMOR_ONLY_MUTECT2:GETPILEUPSUMMARIES (BR666F)'

Caused by:
  Process `NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_TUMOR_ONLY_ALL:BAM_VARIANT_CALLING_TUMOR_ONLY_MUTECT2:GETPILEUPSUMMARIES (BR666F)` terminated with an error exit status (2)

Command executed:

  gatk --java-options "-Xmx9830M -XX:-UsePerfData" \
      GetPileupSummaries \
      --input BR666F.sorted.cram \
      --variant af-only-gnomad.hg38.vcf.gz \
      --output BR666F.mutect2.chr2_16146120-32867130.pileups.table \
      --reference GRCh38_latest_genomic.fna \
      --intervals chr2_16146120-32867130.bed \
      --tmp-dir . \


  cat <<-END_VERSIONS > versions.yml
  "NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_TUMOR_ONLY_ALL:BAM_VARIANT_CALLING_TUMOR_ONLY_MUTECT2:GETPILEUPSUMMARIES":
      gatk4: $(echo $(gatk --version 2>&1) | sed 's/^.*(GATK) v//; s/ .*$//')
  END_VERSIONS

Command exit status:
  2

Command output:
  (empty)

Command error:
  Using GATK jar /usr/local/share/gatk4-4.5.0.0-0/gatk-package-4.5.0.0-local.jar
  Running:
      java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx9830M -XX:-UsePerfData -jar /usr/local/share/gatk4-4.5.0.0-0/gatk-package-4.5.0.0-local.jar GetPileupSummaries --input BR666F.sorted.cram --variant af-only-gnomad.hg38.vcf.gz --output BR666F.mutect2.chr2_16146120-32867130.pileups.table --reference GRCh38_latest_genomic.fna --intervals chr2_16146120-32867130.bed --tmp-dir .
  13:15:32.947 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/usr/local/share/gatk4-4.5.0.0-0/gatk-package-4.5.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
  13:15:33.307 INFO  GetPileupSummaries - ------------------------------------------------------------
  13:15:33.313 INFO  GetPileupSummaries - The Genome Analysis Toolkit (GATK) v4.5.0.0
  13:15:33.314 INFO  GetPileupSummaries - For support and documentation go to https://software.broadinstitute.org/gatk/
  13:15:33.314 INFO  GetPileupSummaries - Executing as plgouttebel@n064 on Linux v3.10.0-1160.el7.x86_64 amd64
  13:15:33.314 INFO  GetPileupSummaries - Java runtime: OpenJDK 64-Bit Server VM v17.0.10-internal+0-adhoc..src
  13:15:33.314 INFO  GetPileupSummaries - Start Date/Time: May 7, 2024 at 1:15:32 PM GMT
  13:15:33.314 INFO  GetPileupSummaries - ------------------------------------------------------------
  13:15:33.315 INFO  GetPileupSummaries - ------------------------------------------------------------
  13:15:33.316 INFO  GetPileupSummaries - HTSJDK Version: 4.1.0
  13:15:33.316 INFO  GetPileupSummaries - Picard Version: 3.1.1
  13:15:33.316 INFO  GetPileupSummaries - Built for Spark Version: 3.5.0
  13:15:33.317 INFO  GetPileupSummaries - HTSJDK Defaults.COMPRESSION_LEVEL : 2
  13:15:33.317 INFO  GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
  13:15:33.317 INFO  GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
  13:15:33.317 INFO  GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
  13:15:33.318 INFO  GetPileupSummaries - Deflater: IntelDeflater
  13:15:33.318 INFO  GetPileupSummaries - Inflater: IntelInflater
  13:15:33.318 INFO  GetPileupSummaries - GCS max retries/reopens: 20
  13:15:33.318 INFO  GetPileupSummaries - Requester pays: disabled
  13:15:33.319 INFO  GetPileupSummaries - Initializing engine
  13:15:33.322 INFO  GetPileupSummaries - Shutting down engine
  [May 7, 2024 at 1:15:33 PM GMT] org.broadinstitute.hellbender.tools.walkers.contamination.GetPileupSummaries done. Elapsed time: 0.01 minutes.
  Runtime.totalMemory()=167772160
  ***********************************************************************

  A USER ERROR has occurred: Fasta index file file://GRCh38_latest_genomic.fna.fai for reference file://GRCh38_latest_genomic.fna does not exist. Please see https://gatk.broadinstitute.org/hc/articles/360035531652-FASTA-Reference-genome-format for help creating it.

  ***********************************************************************
  Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.

Work dir:
  /scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/a0/84752509ea76ccd51c89f3b8af9c20

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
May-07 15:15:33.769 [Task monitor] INFO  nextflow.Session - Execution cancelled -- Finishing pending tasks before exit
May-07 15:15:33.795 [main] DEBUG nextflow.Session - Session await > all processes finished

Relevant files

[plgouttebel@login01 nf-core-sarek_dev]$ ls -l /scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/a0/84752509ea76ccd51c89f3b8af9c20
total 4
lrwxrwxrwx 1 plgouttebel ubx2 160 May  7 15:15 af-only-gnomad.hg38.vcf.gz -> /scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/stage-afe03af1-e05d-4a93-af32-09b63a751b4a/3c/e686ef595583a185a5b7f2480f6f94/af-only-gnomad.hg38.vcf.gz
lrwxrwxrwx 1 plgouttebel ubx2 164 May  7 15:15 af-only-gnomad.hg38.vcf.gz.tbi -> /scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/stage-afe03af1-e05d-4a93-af32-09b63a751b4a/e9/bc174e86314d14b42fab79c5283b02/af-only-gnomad.hg38.vcf.gz.tbi
lrwxrwxrwx 1 plgouttebel ubx2 109 May  7 15:15 BR666F.sorted.cram -> /scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/1a/5ad95b654c06311dc198df39b7a33d/BR666F.sorted.cram
lrwxrwxrwx 1 plgouttebel ubx2 114 May  7 15:15 BR666F.sorted.cram.crai -> /scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/1a/5ad95b654c06311dc198df39b7a33d/BR666F.sorted.cram.crai
lrwxrwxrwx 1 plgouttebel ubx2 117 May  7 15:15 chr2_16146120-32867130.bed -> /scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/b0/fead63adc11db1d9353e4e666e6bf9/chr2_16146120-32867130.bed
lrwxrwxrwx 1 plgouttebel ubx2  69 May  7 15:15 GRCh38_latest_genomic.fna -> /gpfs/home/plgouttebel/home/exomic/data/ref/GRCh38_latest_genomic.fna
lrwxrwxrwx 1 plgouttebel ubx2 162 May  7 15:15 Homo_sapiens_assembly38.dict -> /scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/stage-afe03af1-e05d-4a93-af32-09b63a751b4a/0f/674a437a17df7ac9f50ac6d50c930c/Homo_sapiens_assembly38.dict
lrwxrwxrwx 1 plgouttebel ubx2 167 May  7 15:15 Homo_sapiens_assembly38.fasta.fai -> /scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/stage-afe03af1-e05d-4a93-af32-09b63a751b4a/01/63bf12053a02deb319a2f6ac4dbe47/Homo_sapiens_assembly38.fasta.fai
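
One detail in this listing may be relevant: the staged index is named `Homo_sapiens_assembly38.fasta.fai`, while the staged reference is `GRCh38_latest_genomic.fna`. The GATK error shows it looking for `GRCh38_latest_genomic.fna.fai`, that is, the reference name plus `.fai`, and no such file is in the work dir. A minimal shell sketch of that check follows; `has_matching_fai` is a hypothetical helper name used only for illustration, not something sarek or GATK provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of sarek or GATK):
# succeed only if "<reference>.fai" exists next to the reference,
# which is the file name GATK reported looking for in the error above.
has_matching_fai() {
    local ref="$1"
    [ -e "${ref}.fai" ]
}

# Example use inside a task work dir:
#   has_matching_fai GRCh38_latest_genomic.fna \
#       || echo "no index named after the reference was staged"
```

Run against the work dir above, this check would presumably fail, since the only `.fai` present is named after a different FASTA.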

System information

HPC on curta from MCIA (Mésocentre de calcul intensif aquitain)
sarek downloaded locally

Ist4lri added the bug (Something isn't working) label on May 7, 2024
@maxulysse
Member

From what I can see, the issue is that `genome` should have been set to null.
But in my opinion, sarek should either have failed early,
or printed a huge warning and recomputed the basic indices from the FASTA file:
fai, dict, and whatever other index needs to be built.
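
The last suggestion in this comment, recomputing the basic indices from the user's FASTA, can be sketched by hand in shell. This is only an illustration, assuming `samtools` and `gatk` are on the PATH; `build_basic_indices` is a hypothetical name, and nothing here reflects how sarek itself does or would do it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the .fai and .dict for a user-supplied FASTA.
# Assumes samtools and gatk are installed and on the PATH.
build_basic_indices() {
    local ref="$1"
    # samtools faidx writes "<ref>.fai" next to the FASTA
    samtools faidx "$ref" || return 1
    # CreateSequenceDictionary writes the dictionary to the path given with -O;
    # here the FASTA extension is replaced by ".dict"
    gatk CreateSequenceDictionary -R "$ref" -O "${ref%.*}.dict" || return 1
}

# Example:
#   build_basic_indices /path/to/GRCh38_latest_genomic.fna
```

The resulting files could then be supplied alongside `fasta`, for example through sarek's `fasta_fai` and `dict` parameters. Whether that alone resolves this issue is not confirmed in the thread.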
