[bug] (or user error) kraken2 db not found #492

Open
incoherentian opened this issue Mar 5, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@incoherentian

Hi! I'm probably doing something wrong here, but I left "bug" in the title just in case :)

The Error

--------------------------------------------------------------------
ERROR ~ Required parameters are missing, please check: --kraken2_db

 -- Check '.nextflow.log' file for details
Required Parameters
  --bactopia                          [string]  The path to bactopia results to use as inputs
  --kraken2_db                        [string]  The a single tarball or path to a Kraken2 formatted database

--------------------------------------------------------------------
ERROR ~ ERROR: Validation of pipeline parameters failed!

The Execution

bactopia --wf kraken2 \
  -profile arcc_hawk \
  --cluster_opts '--account scw1234 --qos=maxjobs1500' \
  --max_cpus 8 \
  --bactopia /scratch/c.medib/bactopia/out_bactopia/d121_via_d83_COV_19122023JA1TO59_tg_c.medib_job56038718 \ 
  --kraken2_db /scratch/c.medib/.databases/k2_pluspf_08gb_20240112 \
  --max_retry 0 \
  --cleanup_workdir 

Database exists:

[c.medib@cl2(hawk) .databases]$ tar -xvzf k2_pluspf_08gb_20240112.tar.gz -C /scratch/c.medib/.databases/k2_pluspf_08gb_20240112
hash.k2d
opts.k2d
taxo.k2d
seqid2taxid.map
inspect.txt
ktaxonomy.tsv
library_report.tsv
database100mers.kmer_distrib
database150mers.kmer_distrib
database200mers.kmer_distrib
database250mers.kmer_distrib
database300mers.kmer_distrib
database50mers.kmer_distrib
database75mers.kmer_distrib
unmapped_accessions.txt
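
For reference, the listing above can be checked mechanically. The sketch below assumes only that a Kraken2 database directory needs its three core index files (`hash.k2d`, `opts.k2d`, `taxo.k2d`); the function name `missing_kraken2_files` is hypothetical and is not part of Bactopia or Kraken2.

```python
from pathlib import Path

# Core index files that a built Kraken2 database directory is expected to contain.
REQUIRED_K2_FILES = ("hash.k2d", "opts.k2d", "taxo.k2d")


def missing_kraken2_files(db_dir):
    """Return the required Kraken2 index files that are absent from db_dir.

    Hypothetical helper for illustration; an empty list means all three
    core files are present as regular files.
    """
    db = Path(db_dir)
    return [name for name in REQUIRED_K2_FILES if not (db / name).is_file()]
```

Calling `missing_kraken2_files("/scratch/c.medib/.databases/k2_pluspf_08gb_20240112")` on the cluster should return an empty list if the extraction shown above succeeded.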

Bactopia file structure is intact:
/scratch/c.medib/bactopia/out_bactopia/d121_via_d83_COV_19122023JA1TO59_tg_c.medib_job56038718/COV419_tg
|-- main
|   |-- annotator
|   |   `-- prokka
|   |       |-- COV419_tg-blastdb.tar.gz
|   |       |-- COV419_tg.faa.gz
|   |       |-- COV419_tg.ffn.gz
|   |       |-- COV419_tg.fna.gz
|   |       |-- COV419_tg.fsa.gz
|   |       |-- COV419_tg.gbk.gz
|   |       |-- COV419_tg.gff.gz
|   |       |-- COV419_tg.sqn.gz
|   |       |-- COV419_tg.tbl.gz
|   |       |-- COV419_tg.tsv
|   |       |-- COV419_tg.txt
|   |       `-- logs
|   |           |-- COV419_tg.err
|   |           |-- COV419_tg.log
|   |           |-- nf-prokka.begin
|   |           |-- nf-prokka.err
|   |           |-- nf-prokka.log
|   |           |-- nf-prokka.out
|   |           |-- nf-prokka.run
|   |           |-- nf-prokka.sh
|   |           |-- nf-prokka.trace
|   |           `-- versions.yml
|   |-- assembler
|   |   |-- COV419_tg.fna.gz
|   |   |-- COV419_tg.tsv
|   |   |-- flash.hist
|   |   |-- flash.histogram
|   |   |-- logs
|   |   |   |-- nf-assembler.begin
|   |   |   |-- nf-assembler.err
|   |   |   |-- nf-assembler.log
|   |   |   |-- nf-assembler.out
|   |   |   |-- nf-assembler.run
|   |   |   |-- nf-assembler.sh
|   |   |   |-- nf-assembler.trace
|   |   |   |-- shovill.log
|   |   |   `-- versions.yml
|   |   |-- shovill.corrections
|   |   `-- spades-unpolished.gfa.gz
|   |-- gather
|   |   |-- COV419_tg-meta.tsv
|   |   |-- logs
|   |   |   |-- nf-gather.begin
|   |   |   |-- nf-gather.err
|   |   |   |-- nf-gather.log
|   |   |   |-- nf-gather.out
|   |   |   |-- nf-gather.run
|   |   |   |-- nf-gather.sh
|   |   |   |-- nf-gather.trace
|   |   |   `-- versions.yml
|   |   `-- multiple-read-sets-merged.txt
|   |-- qc
|   |   |-- COV419_tg_R1.fastq.gz
|   |   |-- COV419_tg_R2.fastq.gz
|   |   |-- extra
|   |   |   `-- EMPTY_EXTRA
|   |   |-- logs
|   |   |   |-- COV419_tg-fastp.log
|   |   |   |-- nf-qc.begin
|   |   |   |-- nf-qc.err
|   |   |   |-- nf-qc.log
|   |   |   |-- nf-qc.out
|   |   |   |-- nf-qc.run
|   |   |   |-- nf-qc.sh
|   |   |   |-- nf-qc.trace
|   |   |   `-- versions.yml
|   |   `-- summary
|   |       |-- COV419_tg.fastp.html
|   |       |-- COV419_tg.fastp.json
|   |       |-- COV419_tg_R1-final_fastqc.html
|   |       |-- COV419_tg_R1-final_fastqc.zip
|   |       |-- COV419_tg_R1-final.json
|   |       |-- COV419_tg_R1-original_fastqc.html
|   |       |-- COV419_tg_R1-original_fastqc.zip
|   |       |-- COV419_tg_R1-original.json
|   |       |-- COV419_tg_R2-final_fastqc.html
|   |       |-- COV419_tg_R2-final_fastqc.zip
|   |       |-- COV419_tg_R2-final.json
|   |       |-- COV419_tg_R2-original_fastqc.html
|   |       |-- COV419_tg_R2-original_fastqc.zip
|   |       `-- COV419_tg_R2-original.json
|   `-- sketcher
|       |-- COV419_tg-k21.msh
|       |-- COV419_tg-k31.msh
|       |-- COV419_tg-mash-refseq88-k21.txt
|       |-- COV419_tg.sig
|       |-- COV419_tg-sourmash-gtdb-rs207-k31.txt
|       `-- logs
|           |-- nf-sketcher.begin
|           |-- nf-sketcher.err
|           |-- nf-sketcher.log
|           |-- nf-sketcher.out
|           |-- nf-sketcher.run
|           |-- nf-sketcher.sh
|           |-- nf-sketcher.trace
|           `-- versions.yml
`-- tools
  |-- amrfinderplus
  |   |-- COV419_tg-genes.tsv
  |   |-- COV419_tg-proteins.tsv
  |   `-- logs
  |       |-- nf-amrfinderplus.begin
  |       |-- nf-amrfinderplus.err
  |       |-- nf-amrfinderplus.log
  |       |-- nf-amrfinderplus.out
  |       |-- nf-amrfinderplus.run
  |       |-- nf-amrfinderplus.sh
  |       |-- nf-amrfinderplus.trace
  |       `-- versions.yml
  `-- mlst
      |-- COV419_tg.tsv
      `-- logs
          |-- nf-mlst.begin
          |-- nf-mlst.err
          |-- nf-mlst.log
          |-- nf-mlst.out
          |-- nf-mlst.run
          |-- nf-mlst.sh
          |-- nf-mlst.trace
          `-- versions.yml

19 directories, 104 files

I received the same error when I originally pointed --kraken2_db directly at the tarball, e.g.:

bactopia --wf kraken2 \
  -profile arcc_hawk \
  --cluster_opts '--account scw1773 --qos=maxjobs1500' \
  --max_cpus 8 \
  --bactopia /scratch/c.medib/bactopia/out_bactopia/d121_via_d83_COV_19122023JA1TO59_tg_c.medib_job56038718 \ 
  --kraken2_db /scratch/c.medib/.databases/k2_pluspf_08gb_20240112.tar.gz \
  --max_retry 0 \
  --cleanup_workdir 
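
One thing worth ruling out: as pasted here, the `--bactopia` line in both commands appears to end with a space after the backslash. If that space is also present in the real shell command (it may simply be a copy/paste artifact), the shell treats `\ ` as an escaped space instead of a line continuation, so the command ends on that line and `--kraken2_db` never reaches bactopia. That would be consistent with the "Required parameters are missing" error. A minimal sketch for spotting such lines in a saved command or job script follows; the function name `broken_continuations` is hypothetical.

```python
import re

# A backslash followed by spaces/tabs at the end of a line is NOT a line
# continuation in POSIX shells; the backslash escapes the space instead.
_BROKEN_CONTINUATION = re.compile(r"\\[ \t]+$")


def broken_continuations(command_text):
    """Return 1-based line numbers where a backslash is followed by trailing whitespace."""
    return [
        lineno
        for lineno, line in enumerate(command_text.splitlines(), start=1)
        if _BROKEN_CONTINUATION.search(line)
    ]
```

Running this over the text of a submission script lists the lines to inspect; an empty list means no backslash is followed by trailing whitespace.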

Expected Behavior
Kraken2 runs using the database specified by --kraken2_db.

Execution Environment

  • Bactopia Version: 3.0.1 built 202311 with some merlin and cfg edits unrelated to kraken2
  • OS: RHEL or CentOS
  • Environment:
    Nextflow runs from a conda env inside a SLURM job on the cluster;
    Nextflow then submits the resulting Singularity jobs to SLURM on the same cluster

Additional Information
⚠️ I'll probably stick with mashdist but thought I'd ask about this in case I'm ever desperate for bracken

@incoherentian added the bug label Mar 5, 2024
@rpetit3
Member

rpetit3 commented Mar 5, 2024

That's no bueno, and something I will look into.

Hope to update soon, hope all is well on your end!
