Error in executing PROKKA #601

Open
anugos opened this issue Mar 13, 2024 · 4 comments

Comments

@anugos

anugos commented Mar 13, 2024

nf-core/mag v2.5.4-ge486bb2
Run Name: bq-mag19
nf-core/mag execution completed unsuccessfully!
The exit status of the task that caused the workflow execution to fail was: 2.

The full error message was:

Error executing process > 'NFCORE_MAG:MAG:PROKKA (MEGAHIT-MetaBAT2-group-10.14)'

Caused by:
Process NFCORE_MAG:MAG:PROKKA (MEGAHIT-MetaBAT2-group-10.14) terminated with an error exit status (2)

Command executed:

prokka --metagenome --cpus 2 --prefix MEGAHIT-MetaBAT2-group-10.14 MEGAHIT-MetaBAT2-group-10.14.fa

cat <<-END_VERSIONS > versions.yml
"NFCORE_MAG:MAG:PROKKA":
prokka: $(echo $(prokka --version 2>&1) | sed 's/^.*prokka //')
END_VERSIONS

Command exit status:
2

Command output:
(empty)

Command error:
[21:24:50] Determined blastp version is 002012 from 'blastp: 2.12.0+'
[21:24:50] Looking for 'cmpress' - found /usr/local/bin/cmpress
[21:24:50] Determined cmpress version is 001001 from '# INFERNAL 1.1.4 (Dec 2020)'
[21:24:50] Looking for 'cmscan' - found /usr/local/bin/cmscan
[21:24:50] Determined cmscan version is 001001 from '# INFERNAL 1.1.4 (Dec 2020)'
[21:24:50] Looking for 'egrep' - found /bin/egrep
[21:24:50] Looking for 'find' - found /usr/bin/find
[21:24:50] Looking for 'grep' - found /bin/grep
[21:24:50] Looking for 'hmmpress' - found /usr/local/bin/hmmpress
[21:24:50] Determined hmmpress version is 003003 from '# HMMER 3.3.2 (Nov 2020); http://hmmer.org/'
[21:24:50] Looking for 'hmmscan' - found /usr/local/bin/hmmscan
[21:24:50] Determined hmmscan version is 003003 from '# HMMER 3.3.2 (Nov 2020); http://hmmer.org/'
[21:24:50] Looking for 'java' - found /usr/local/bin/java
[21:24:50] Looking for 'makeblastdb' - found /usr/local/bin/makeblastdb
[21:24:50] Determined makeblastdb version is 002012 from 'makeblastdb: 2.12.0+'
[21:24:50] Looking for 'minced' - found /usr/local/bin/minced
[21:24:50] Determined minced version is 004002 from 'minced 0.4.2'
[21:24:50] Looking for 'parallel' - found /usr/local/bin/parallel
[21:24:50] Determined parallel version is 20220222 from 'GNU parallel 20220222'
[21:24:50] Looking for 'prodigal' - found /usr/local/bin/prodigal
[21:24:50] Determined prodigal version is 002006 from 'Prodigal V2.6.3: February, 2016'
[21:24:50] Looking for 'prokka-genbank_to_fasta_db' - found /usr/local/bin/prokka-genbank_to_fasta_db
[21:24:50] Looking for 'sed' - found /bin/sed
[21:24:50] Looking for 'tbl2asn' - found /usr/local/bin/tbl2asn
[21:24:51] Determined tbl2asn version is 025007 from 'tbl2asn 25.7 arguments:'
[21:24:51] Using genetic code table 11.
[21:24:51] Loading and checking input file: MEGAHIT-MetaBAT2-group-10.14.fa
[21:24:51] Wrote 65 contigs totalling 205963 bp.
[21:24:51] Predicting tRNAs and tmRNAs
[21:24:51] Running: aragorn -l -gc11 -w MEGAHIT-MetaBAT2-group-10.14/MEGAHIT-MetaBAT2-group-10.14.fna
[21:24:51] Found 0 tRNAs
[21:24:51] Predicting Ribosomal RNAs
[21:24:51] Running Barrnap with 2 threads
[21:24:51] Found 0 rRNAs
[21:24:51] Skipping ncRNA search, enable with --rfam if desired.
[21:24:51] Total of 0 tRNA + rRNA features
[21:24:51] Searching for CRISPR repeats
[21:24:51] Found 0 CRISPRs
[21:24:51] Predicting coding sequences
[21:24:51] Contigs total 205963 bp, so using meta mode
[21:24:51] Running: prodigal -i MEGAHIT-MetaBAT2-group-10.14/MEGAHIT-MetaBAT2-group-10.14.fna -c -m -g 11 -p meta -f sco -q
[21:24:52] Found 226 CDS
[21:24:52] Connecting features back to sequences
[21:24:52] Not using genus-specific database. Try --usegenus to enable it.
[21:24:52] Annotating CDS, please be patient.
[21:24:52] Will use 2 CPUs for similarity searching.
[21:24:52] There are still 226 unannotated CDS left (started with 226)
[21:24:52] Will use blast to search against /usr/local/db/kingdom/Bacteria/IS with 2 CPUs
[21:24:52] Running: cat MEGAHIT-MetaBAT2-group-10.14/MEGAHIT-MetaBAT2-group-10.14.IS.tmp.42.faa | parallel --gnu --plain -j 2 --block 14374 --recstart '>' --pipe blastp -query - -db /usr/local/db/kingdom/Bacteria/IS -evalue 1e-30 -qcov_hsp_perc 90 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > MEGAHIT-MetaBAT2-group-10.14/MEGAHIT-MetaBAT2-group-10.14.IS.tmp.42.blast 2> /dev/null
[21:24:53] Could not run command: cat MEGAHIT-MetaBAT2-group-10.14/MEGAHIT-MetaBAT2-group-10.14.IS.tmp.42.faa | parallel --gnu --plain -j 2 --block 14374 --recstart '>' --pipe blastp -query - -db /usr/local/db/kingdom/Bacteria/IS -evalue 1e-30 -qcov_hsp_perc 90 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > MEGAHIT-MetaBAT2-group-10.14/MEGAHIT-MetaBAT2-group-10.14.IS.tmp.42.blast 2> /dev/null

Work dir:
/data/user/anugos24/Black-Queen-analysis/Shotgun-Metagenome/redo_results_2024/work/d8/6fda25e960ff9bd0d71920b903df93

Tip: view the complete command output by changing to the process work dir and entering the command cat .command.out
The workflow was completed at 2024-03-12T21:29:39.822735-05:00 (duration: 5m 5s)

The command used to launch the workflow was as follows:

nextflow run nf-core/mag -r 2.5.4 -name bq-mag19 -profile singularity -params-file nf-params.json -c custom.config -resume bq-mag18
Pipeline Configuration:
revision: 2.5.4
runName: bq-mag19
containerEngine: singularity
container: [PROKKA:https://depot.galaxyproject.org/singularity/prokka:1.14.6--pl5321hdfd78af_4]
profile: singularity
configFiles:
phix_reference: /home/anugos24/.nextflow/assets/nf-core/mag/assets/data/GCA_002596845.1_ASM259684v1_genomic.fna.gz
lambda_reference: /home/anugos24/.nextflow/assets/nf-core/mag/assets/data/GCA_000840245.1_ViralProj14204_genomic.fna.gz
kraken2_db: /data/user/database/minikraken_8GB_202003.tgz
skip_krona: true
gtdbtk_min_perc_aa: 10
gtdbtk_pplacer_cpus: 1
coassemble_group: true
megahit_options: --presets meta-large
skip_spades: true
skip_spadeshybrid: true
skip_prodigal: true
skip_metaeuk: true
skip_maxbin2: true
skip_concoct: true
bowtie2_mode: --very-sensitive
save_assembly_mapped_reads: true
busco_db: /data/user/bacteria_odb10.2020-03-06.tar.gz
busco_auto_lineage_prok: true
busco_clean: true

Nextflow Version: 23.10.1
Nextflow Build: 5891
Nextflow Compile Timestamp: 12-01-2024 22:01 UTC
nf-core/mag

@jfy133
Member

jfy133 commented Mar 15, 2024

Hi @anugos
This seems to be a common and 'unresolved' prokka error.
The recommendation is posted here: tseemann/prokka#402 (comment)

Please install PROKKA manually (e.g. via conda), cd into the work directory reported in the error, then use the command in the .command.sh file to re-run prokka, but re-run the failing step without redirecting its error output to /dev/null so that the underlying error message is visible.
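
For anyone following this suggestion, here is a minimal sketch of one way to recover the failing command from the log without its `2> /dev/null` redirect. The function name `strip_stderr_redirect` is hypothetical (it is not part of Prokka or nf-core/mag), and it assumes the log line has the `[HH:MM:SS] Could not run command: ...` shape shown in the error above.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Prokka or nf-core/mag.
# Reads Prokka log lines on stdin and prints the logged command with
# the "[HH:MM:SS] Could not run command: " prefix and the trailing
# "2> /dev/null" redirect removed, so stderr stays visible on re-run.
strip_stderr_redirect() {
  sed -E \
    -e 's#^[[:space:]]*\[[0-9:]+\] Could not run command: ##' \
    -e 's#[[:space:]]*2> ?/dev/null[[:space:]]*$##'
}
```

Possible usage from inside the Nextflow work directory, with prokka and its dependencies on the PATH: `grep 'Could not run command' .command.err | strip_stderr_redirect`, then paste and run the printed command. This assumes the intermediate `.faa` file from the failed run is still present in the work directory, which may not always be the case.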

@roberta-davidson

roberta-davidson commented Mar 20, 2024

Hey @anugos @jfy133! I found a bit of a workaround. I downloaded this container for Prokka and then modified my config to use it. It also would not work via Slurm submission to our HPC, but did on the head node (?!), and I then had to run one task at a time so that the tmp directories for Prokka didn't overwrite each other. On second thought, maybe using a different container was unnecessary, but anyway.
Overall additions to config file:

process {
  executor = 'slurm'
  clusterOptions = "-N 1 -p skylake,icelake"
  withName: PROKKA {
    container = '/<path>/prokka_1.14.6--pl5321hdfd78af_5.sif'
    executor = 'local'
    maxForks = 1
  }
}

@jfy133
Member

jfy133 commented Mar 20, 2024

@roberta-davidson huh, interesting... what was the actual error for you (i.e., what was otherwise piped to nothing)?

Is it a /tmp clash or something? We could maybe set this to use the process-specific work directory...
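
If a /tmp clash does turn out to be the cause, one possible way to give each task its own temporary directory is sketched below. This is untested against the error in this issue. The function name `use_task_local_tmpdir` is hypothetical, and the sketch relies only on the fact that GNU parallel places its temporary files under `$TMPDIR` when that variable is set.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of nf-core/mag.
# Creates a tmp directory under the given base (default: the current
# directory, i.e. the task work dir) and points TMPDIR at it, so tools
# that honour TMPDIR stop sharing /tmp between concurrent tasks.
use_task_local_tmpdir() {
  local base="${1:-$PWD}"
  mkdir -p "$base/tmp" || return 1
  export TMPDIR="$base/tmp"
}
```

It would have to be called in the same shell that then launches prokka. Whether the variable reaches the tools inside the Singularity container has not been checked here.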

@roberta-davidson

roberta-davidson commented Mar 20, 2024

The original failing command from running mag was:

  [17:36:49] Could not run command: cat MEGAHIT\-MetaBAT2\-Bushfire_A24936\.15\/MEGAHIT\-MetaBAT2\-Bushfire_A24936\.15\.IS\.tmp\.44\.faa | parallel --gnu --plain -j 2 --block 43333 --recstart '>' --pipe blastp -query - -db /usr/local/db/kingdom/Bacteria/IS -evalue 1e-30 -qcov_hsp_perc 90 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > MEGAHIT\-MetaBAT2\-Bushfire_A24936\.15\/MEGAHIT\-MetaBAT2\-Bushfire_A24936\.15\.IS\.tmp\.44\.blast 2> /dev/null

I really don't understand why this workaround works...
I set out to do as you suggest above, and wrote a script to run .command.sh in each work dir using my own prokka container. Then @shyama-mama figured out that we could just adjust the config and point to that container when the pipeline runs. We then realised that .command.sh with my container worked on the head node but not within the pipeline (no idea why), so we adjusted the config to execute locally, one task at a time.
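
The script mentioned here was not posted, so the following is only a rough sketch of what running `.command.sh` in each work dir, one at a time, could look like. The function name `run_command_sh_serially` is hypothetical, and it assumes the work directories are passed in explicitly and that prokka is already available in the calling environment (for example via a container or conda).

```shell
#!/usr/bin/env bash
# Hypothetical helper, not the script used in this thread.
# Runs the .command.sh found in each given Nextflow work directory,
# strictly one after another, and reports which directories failed.
# Returns 1 if any run failed, 0 otherwise.
run_command_sh_serially() {
  local dir status=0
  for dir in "$@"; do
    if [ ! -f "$dir/.command.sh" ]; then
      echo "skip: $dir (no .command.sh)" >&2
      continue
    fi
    echo "running: $dir"
    # Subshell so the cd does not leak into the next iteration.
    ( cd "$dir" && bash .command.sh ) || {
      echo "failed: $dir" >&2
      status=1
    }
  done
  return "$status"
}
```

Running serially mirrors the `maxForks = 1` setting in the config above: only one Prokka task is active at any time.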
