Add support for presupplied annotation files (FAA + GFF or FAA + GBK) #340

Open. Wants to merge 54 commits into base: dev

Member

@jfy133 jfy133 commented Feb 14, 2024

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/funcscan branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

github-actions bot commented Feb 14, 2024

nf-core lint overall result: Passed ✅

Posted for pipeline commit 2d8b238

✅ 315 tests passed

✅ Tests passed:

  • files_exist - File found: .gitattributes
  • files_exist - File found: .gitignore
  • files_exist - File found: .nf-core.yml
  • files_exist - File found: .editorconfig
  • files_exist - File found: .prettierignore
  • files_exist - File found: .prettierrc.yml
  • files_exist - File found: CHANGELOG.md
  • files_exist - File found: CITATIONS.md
  • files_exist - File found: CODE_OF_CONDUCT.md
  • files_exist - File found: LICENSE or LICENSE.md or LICENCE or LICENCE.md
  • files_exist - File found: nextflow_schema.json
  • files_exist - File found: nextflow.config
  • files_exist - File found: README.md
  • files_exist - File found: .github/.dockstore.yml
  • files_exist - File found: .github/CONTRIBUTING.md
  • files_exist - File found: .github/ISSUE_TEMPLATE/bug_report.yml
  • files_exist - File found: .github/ISSUE_TEMPLATE/config.yml
  • files_exist - File found: .github/ISSUE_TEMPLATE/feature_request.yml
  • files_exist - File found: .github/PULL_REQUEST_TEMPLATE.md
  • files_exist - File found: .github/workflows/branch.yml
  • files_exist - File found: .github/workflows/ci.yml
  • files_exist - File found: .github/workflows/linting_comment.yml
  • files_exist - File found: .github/workflows/linting.yml
  • files_exist - File found: assets/email_template.html
  • files_exist - File found: assets/email_template.txt
  • files_exist - File found: assets/sendmail_template.txt
  • files_exist - File found: assets/nf-core-funcscan_logo_light.png
  • files_exist - File found: conf/modules.config
  • files_exist - File found: conf/test.config
  • files_exist - File found: conf/test_full.config
  • files_exist - File found: docs/images/nf-core-funcscan_logo_light.png
  • files_exist - File found: docs/images/nf-core-funcscan_logo_dark.png
  • files_exist - File found: docs/output.md
  • files_exist - File found: docs/README.md
  • files_exist - File found: docs/README.md
  • files_exist - File found: docs/usage.md
  • files_exist - File found: main.nf
  • files_exist - File found: assets/multiqc_config.yml
  • files_exist - File found: conf/base.config
  • files_exist - File found: conf/igenomes.config
  • files_exist - File found: .github/workflows/awstest.yml
  • files_exist - File found: .github/workflows/awsfulltest.yml
  • files_exist - File found: modules.json
  • files_exist - File not found check: .github/ISSUE_TEMPLATE/bug_report.md
  • files_exist - File not found check: .github/ISSUE_TEMPLATE/feature_request.md
  • files_exist - File not found check: .github/workflows/push_dockerhub.yml
  • files_exist - File not found check: .markdownlint.yml
  • files_exist - File not found check: .nf-core.yaml
  • files_exist - File not found check: .yamllint.yml
  • files_exist - File not found check: bin/markdown_to_html.r
  • files_exist - File not found check: conf/aws.config
  • files_exist - File not found check: docs/images/nf-core-funcscan_logo.png
  • files_exist - File not found check: lib/Checks.groovy
  • files_exist - File not found check: lib/Completion.groovy
  • files_exist - File not found check: lib/NfcoreTemplate.groovy
  • files_exist - File not found check: lib/Utils.groovy
  • files_exist - File not found check: lib/Workflow.groovy
  • files_exist - File not found check: lib/WorkflowMain.groovy
  • files_exist - File not found check: lib/WorkflowFuncscan.groovy
  • files_exist - File not found check: parameters.settings.json
  • files_exist - File not found check: pipeline_template.yml
  • files_exist - File not found check: Singularity
  • files_exist - File not found check: lib/nfcore_external_java_deps.jar
  • files_exist - File not found check: .travis.yml
  • nextflow_config - Config variable found: manifest.name
  • nextflow_config - Config variable found: manifest.nextflowVersion
  • nextflow_config - Config variable found: manifest.description
  • nextflow_config - Config variable found: manifest.version
  • nextflow_config - Config variable found: manifest.homePage
  • nextflow_config - Config variable found: timeline.enabled
  • nextflow_config - Config variable found: trace.enabled
  • nextflow_config - Config variable found: report.enabled
  • nextflow_config - Config variable found: dag.enabled
  • nextflow_config - Config variable found: process.cpus
  • nextflow_config - Config variable found: process.memory
  • nextflow_config - Config variable found: process.time
  • nextflow_config - Config variable found: params.outdir
  • nextflow_config - Config variable found: params.input
  • nextflow_config - Config variable found: params.validationShowHiddenParams
  • nextflow_config - Config variable found: params.validationSchemaIgnoreParams
  • nextflow_config - Config variable found: manifest.mainScript
  • nextflow_config - Config variable found: timeline.file
  • nextflow_config - Config variable found: trace.file
  • nextflow_config - Config variable found: report.file
  • nextflow_config - Config variable found: dag.file
  • nextflow_config - Config variable (correctly) not found: params.nf_required_version
  • nextflow_config - Config variable (correctly) not found: params.container
  • nextflow_config - Config variable (correctly) not found: params.singleEnd
  • nextflow_config - Config variable (correctly) not found: params.igenomesIgnore
  • nextflow_config - Config variable (correctly) not found: params.name
  • nextflow_config - Config variable (correctly) not found: params.enable_conda
  • nextflow_config - Config timeline.enabled had correct value: true
  • nextflow_config - Config report.enabled had correct value: true
  • nextflow_config - Config trace.enabled had correct value: true
  • nextflow_config - Config dag.enabled had correct value: true
  • nextflow_config - Config manifest.name began with nf-core/
  • nextflow_config - Config variable manifest.homePage began with https://github.com/nf-core/
  • nextflow_config - Config dag.file ended with .html
  • nextflow_config - Config variable manifest.nextflowVersion started with >= or !>=
  • nextflow_config - Config manifest.version ends in dev: 1.2.0dev
  • nextflow_config - Config params.custom_config_version is set to master
  • nextflow_config - Config params.custom_config_base is set to https://raw.githubusercontent.com/nf-core/configs/master
  • nextflow_config - Lines for loading custom profiles found
  • nextflow_config - nextflow.config contains configuration profile test
  • nextflow_config - Config default value correct: params.contig_qc_lengththreshold= 3000.0
  • nextflow_config - Config default value correct: params.taxa_classification_tool= mmseqs2
  • nextflow_config - Config default value correct: params.taxa_classification_mmseqs_databases_id= Kalamari
  • nextflow_config - Config default value correct: params.taxa_classification_mmseqs_taxonomy_searchtype= 2
  • nextflow_config - Config default value correct: params.taxa_classification_mmseqs_taxonomy_lcaranks= kingdom,phylum,class,order,family,genus,species
  • nextflow_config - Config default value correct: params.taxa_classification_mmseqs_taxonomy_taxlineage= 1
  • nextflow_config - Config default value correct: params.taxa_classification_mmseqs_taxonomy_sensitivity= 5.0
  • nextflow_config - Config default value correct: params.taxa_classification_mmseqs_taxonomy_orffilters= 2.0
  • nextflow_config - Config default value correct: params.taxa_classification_mmseqs_taxonomy_lcamode= 3
  • nextflow_config - Config default value correct: params.taxa_classification_mmseqs_taxonomy_votemode= 1
  • nextflow_config - Config default value correct: params.annotation_tool= pyrodigal
  • nextflow_config - Config default value correct: params.annotation_bakta_db_downloadtype= full
  • nextflow_config - Config default value correct: params.annotation_bakta_mincontiglen= 1
  • nextflow_config - Config default value correct: params.annotation_bakta_translationtable= 11
  • nextflow_config - Config default value correct: params.annotation_bakta_gram= ?
  • nextflow_config - Config default value correct: params.annotation_prokka_kingdom= Bacteria
  • nextflow_config - Config default value correct: params.annotation_prokka_gcode= 11
  • nextflow_config - Config default value correct: params.annotation_prokka_mincontiglen= 1
  • nextflow_config - Config default value correct: params.annotation_prokka_evalue= 1e-06
  • nextflow_config - Config default value correct: params.annotation_prokka_coverage= 80
  • nextflow_config - Config default value correct: params.annotation_prokka_compliant= true
  • nextflow_config - Config default value correct: params.annotation_prodigal_transtable= 11
  • nextflow_config - Config default value correct: params.annotation_pyrodigal_transtable= 11
  • nextflow_config - Config default value correct: params.amp_ampir_model= precursor
  • nextflow_config - Config default value correct: params.amp_ampir_minlength= 10
  • nextflow_config - Config default value correct: params.amp_ampcombi_cutoff= 0.0
  • nextflow_config - Config default value correct: params.arg_amrfinderplus_identmin= -1.0
  • nextflow_config - Config default value correct: params.arg_amrfinderplus_coveragemin= 0.5
  • nextflow_config - Config default value correct: params.arg_amrfinderplus_translationtable= 11
  • nextflow_config - Config default value correct: params.arg_deeparg_data_version= 2
  • nextflow_config - Config default value correct: params.arg_deeparg_model= LS
  • nextflow_config - Config default value correct: params.arg_deeparg_minprob= 0.8
  • nextflow_config - Config default value correct: params.arg_deeparg_alignmentevalue= 1e-10
  • nextflow_config - Config default value correct: params.arg_deeparg_alignmentidentity= 50
  • nextflow_config - Config default value correct: params.arg_deeparg_alignmentoverlap= 0.8
  • nextflow_config - Config default value correct: params.arg_deeparg_numalignmentsperentry= 1000
  • nextflow_config - Config default value correct: params.arg_fargene_hmmmodel= class_a,class_b_1_2,class_b_3,class_c,class_d_1,class_d_2,qnr,tet_efflux,tet_rpg,tet_enzyme
  • nextflow_config - Config default value correct: params.arg_fargene_minorflength= 90
  • nextflow_config - Config default value correct: params.arg_fargene_translationformat= pearson
  • nextflow_config - Config default value correct: params.arg_rgi_savejson= false
  • nextflow_config - Config default value correct: params.arg_rgi_savetmpfiles= false
  • nextflow_config - Config default value correct: params.arg_rgi_alignmenttool= BLAST
  • nextflow_config - Config default value correct: params.arg_rgi_includeloose= false
  • nextflow_config - Config default value correct: params.arg_rgi_includenudge= false
  • nextflow_config - Config default value correct: params.arg_rgi_lowquality= false
  • nextflow_config - Config default value correct: params.arg_rgi_data= NA
  • nextflow_config - Config default value correct: params.arg_rgi_split_prodigal_jobs= true
  • nextflow_config - Config default value correct: params.arg_abricate_db= ncbi
  • nextflow_config - Config default value correct: params.arg_abricate_minid= 80
  • nextflow_config - Config default value correct: params.arg_abricate_mincov= 80
  • nextflow_config - Config default value correct: params.bgc_antismash_contigminlength= 1000
  • nextflow_config - Config default value correct: params.bgc_antismash_hmmdetectionstrictness= relaxed
  • nextflow_config - Config default value correct: params.bgc_antismash_taxon= bacteria
  • nextflow_config - Config default value correct: params.bgc_deepbgc_score= 0.5
  • nextflow_config - Config default value correct: params.bgc_deepbgc_mergemaxproteingap= 0
  • nextflow_config - Config default value correct: params.bgc_deepbgc_mergemaxnuclgap= 0
  • nextflow_config - Config default value correct: params.bgc_deepbgc_minnucl= 1
  • nextflow_config - Config default value correct: params.bgc_deepbgc_minproteins= 1
  • nextflow_config - Config default value correct: params.bgc_deepbgc_mindomains= 1
  • nextflow_config - Config default value correct: params.bgc_deepbgc_minbiodomains= 0
  • nextflow_config - Config default value correct: params.bgc_deepbgc_classifierscore= 0.5
  • nextflow_config - Config default value correct: params.bgc_gecco_cds= 3
  • nextflow_config - Config default value correct: params.bgc_gecco_pfilter= 1e-09
  • nextflow_config - Config default value correct: params.bgc_gecco_threshold= 0.8
  • nextflow_config - Config default value correct: params.bgc_gecco_edgedistance= 0
  • nextflow_config - Config default value correct: params.arg_hamronization_summarizeformat= tsv
  • nextflow_config - Config default value correct: params.custom_config_version= master
  • nextflow_config - Config default value correct: params.custom_config_base= https://raw.githubusercontent.com/nf-core/configs/master
  • nextflow_config - Config default value correct: params.max_cpus= 16
  • nextflow_config - Config default value correct: params.max_memory= 128.GB
  • nextflow_config - Config default value correct: params.max_time= 240.h
  • nextflow_config - Config default value correct: params.publish_dir_mode= copy
  • nextflow_config - Config default value correct: params.max_multiqc_email_size= 25.MB
  • nextflow_config - Config default value correct: params.validate_params= true
  • nextflow_config - Config default value correct: params.pipelines_testdata_base_path= https://raw.githubusercontent.com/nf-core/test-datasets/
  • files_unchanged - .gitattributes matches the template
  • files_unchanged - .prettierrc.yml matches the template
  • files_unchanged - CODE_OF_CONDUCT.md matches the template
  • files_unchanged - LICENSE matches the template
  • files_unchanged - .github/.dockstore.yml matches the template
  • files_unchanged - .github/CONTRIBUTING.md matches the template
  • files_unchanged - .github/ISSUE_TEMPLATE/bug_report.yml matches the template
  • files_unchanged - .github/ISSUE_TEMPLATE/config.yml matches the template
  • files_unchanged - .github/ISSUE_TEMPLATE/feature_request.yml matches the template
  • files_unchanged - .github/PULL_REQUEST_TEMPLATE.md matches the template
  • files_unchanged - .github/workflows/branch.yml matches the template
  • files_unchanged - .github/workflows/linting_comment.yml matches the template
  • files_unchanged - .github/workflows/linting.yml matches the template
  • files_unchanged - assets/email_template.html matches the template
  • files_unchanged - assets/email_template.txt matches the template
  • files_unchanged - assets/sendmail_template.txt matches the template
  • files_unchanged - assets/nf-core-funcscan_logo_light.png matches the template
  • files_unchanged - docs/images/nf-core-funcscan_logo_light.png matches the template
  • files_unchanged - docs/images/nf-core-funcscan_logo_dark.png matches the template
  • files_unchanged - docs/README.md matches the template
  • files_unchanged - .gitignore matches the template
  • files_unchanged - .prettierignore matches the template
  • actions_ci - '.github/workflows/ci.yml' is triggered on expected events
  • actions_ci - '.github/workflows/ci.yml' checks minimum NF version
  • actions_awstest - '.github/workflows/awstest.yml' is triggered correctly
  • actions_awsfulltest - .github/workflows/awsfulltest.yml is triggered correctly
  • actions_awsfulltest - .github/workflows/awsfulltest.yml does not use -profile test
  • readme - README Nextflow minimum version badge matched config. Badge: 23.04.0, Config: 23.04.0
  • readme - README Zenodo placeholder was replaced with DOI.
  • pipeline_todos - No TODO strings found
  • pipeline_name_conventions - Name adheres to nf-core convention
  • template_strings - Did not find any Jinja template strings (313 files)
  • schema_lint - Schema lint passed
  • schema_lint - Schema title + description lint passed
  • schema_lint - Input mimetype lint passed: 'text/csv'
  • schema_params - Schema matched params returned from nextflow config
  • system_exit - No System.exit calls found
  • actions_schema_validation - Workflow validation passed: branch.yml
  • actions_schema_validation - Workflow validation passed: ci.yml
  • actions_schema_validation - Workflow validation passed: awsfulltest.yml
  • actions_schema_validation - Workflow validation passed: fix-linting.yml
  • actions_schema_validation - Workflow validation passed: linting.yml
  • actions_schema_validation - Workflow validation passed: download_pipeline.yml
  • actions_schema_validation - Workflow validation passed: release-announcements.yml
  • actions_schema_validation - Workflow validation passed: clean-up.yml
  • actions_schema_validation - Workflow validation passed: awstest.yml
  • actions_schema_validation - Workflow validation passed: linting_comment.yml
  • merge_markers - No merge markers found in pipeline files
  • modules_json - Only installed modules found in modules.json
  • multiqc_config - assets/multiqc_config.yml found and not ignored.
  • multiqc_config - assets/multiqc_config.yml contains report_section_order
  • multiqc_config - assets/multiqc_config.yml contains export_plots
  • multiqc_config - assets/multiqc_config.yml contains report_comment
  • multiqc_config - assets/multiqc_config.yml follows the ordering scheme of the minimally required plugins.
  • multiqc_config - assets/multiqc_config.yml contains a matching 'report_comment'.
  • multiqc_config - assets/multiqc_config.yml contains 'export_plots: true'.
  • modules_structure - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'
  • base_config - conf/base.config found and not ignored.
  • base_config - GUNZIP found in conf/base.config and Nextflow scripts.
  • base_config - UNTAR found in conf/base.config and Nextflow scripts.
  • base_config - PROKKA found in conf/base.config and Nextflow scripts.
  • base_config - PRODIGAL_GBK found in conf/base.config and Nextflow scripts.
  • base_config - BAKTA_BAKTA found in conf/base.config and Nextflow scripts.
  • base_config - ABRICATE_RUN found in conf/base.config and Nextflow scripts.
  • base_config - AMRFINDERPLUS_RUN found in conf/base.config and Nextflow scripts.
  • base_config - DEEPARG_DOWNLOADDATA found in conf/base.config and Nextflow scripts.
  • base_config - DEEPARG_PREDICT found in conf/base.config and Nextflow scripts.
  • base_config - FARGENE found in conf/base.config and Nextflow scripts.
  • base_config - RGI_MAIN found in conf/base.config and Nextflow scripts.
  • base_config - AMPIR found in conf/base.config and Nextflow scripts.
  • base_config - AMPLIFY_PREDICT found in conf/base.config and Nextflow scripts.
  • base_config - AMP_HMMER_HMMSEARCH found in conf/base.config and Nextflow scripts.
  • base_config - MACREL_CONTIGS found in conf/base.config and Nextflow scripts.
  • base_config - BGC_HMMER_HMMSEARCH found in conf/base.config and Nextflow scripts.
  • base_config - ANTISMASH_ANTISMASHLITE found in conf/base.config and Nextflow scripts.
  • base_config - ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES found in conf/base.config and Nextflow scripts.
  • base_config - DEEPBGC_DOWNLOAD found in conf/base.config and Nextflow scripts.
  • base_config - DEEPBGC_PIPELINE found in conf/base.config and Nextflow scripts.
  • base_config - GECCO_RUN found in conf/base.config and Nextflow scripts.
  • base_config - HAMRONIZATION_ABRICATE found in conf/base.config and Nextflow scripts.
  • base_config - HAMRONIZATION_AMRFINDERPLUS found in conf/base.config and Nextflow scripts.
  • base_config - HAMRONIZATION_DEEPARG found in conf/base.config and Nextflow scripts.
  • base_config - HAMRONIZATION_RGI found in conf/base.config and Nextflow scripts.
  • base_config - HAMRONIZATION_FARGENE found in conf/base.config and Nextflow scripts.
  • base_config - HAMRONIZATION_SUMMARIZE found in conf/base.config and Nextflow scripts.
  • base_config - AMPCOMBI found in conf/base.config and Nextflow scripts.
  • modules_config - conf/modules.config found and not ignored.
  • modules_config - MULTIQC found in conf/modules.config and Nextflow scripts.
  • modules_config - GUNZIP found in conf/modules.config and Nextflow scripts.
  • modules_config - SEQKIT_SEQ_LONG found in conf/modules.config and Nextflow scripts.
  • modules_config - SEQKIT_SEQ_SHORT found in conf/modules.config and Nextflow scripts.
  • modules_config - MMSEQS_DATABASES found in conf/modules.config and Nextflow scripts.
  • modules_config - MMSEQS_CREATEDB found in conf/modules.config and Nextflow scripts.
  • modules_config - MMSEQS_TAXONOMY found in conf/modules.config and Nextflow scripts.
  • modules_config - MMSEQS_CREATETSV found in conf/modules.config and Nextflow scripts.
  • modules_config - PROKKA found in conf/modules.config and Nextflow scripts.
  • modules_config - BAKTA_BAKTADBDOWNLOAD found in conf/modules.config and Nextflow scripts.
  • modules_config - BAKTA_BAKTA found in conf/modules.config and Nextflow scripts.
  • modules_config - PRODIGAL found in conf/modules.config and Nextflow scripts.
  • modules_config - PYRODIGAL found in conf/modules.config and Nextflow scripts.
  • modules_config - ABRICATE_RUN found in conf/modules.config and Nextflow scripts.
  • modules_config - AMRFINDERPLUS_UPDATE found in conf/modules.config and Nextflow scripts.
  • modules_config - AMRFINDERPLUS_RUN found in conf/modules.config and Nextflow scripts.
  • modules_config - DEEPARG_DOWNLOADDATA found in conf/modules.config and Nextflow scripts.
  • modules_config - DEEPARG_PREDICT found in conf/modules.config and Nextflow scripts.
  • modules_config - FARGENE found in conf/modules.config and Nextflow scripts.
  • modules_config - UNTAR_CARD found in conf/modules.config and Nextflow scripts.
  • modules_config - RGI_CARDANNOTATION found in conf/modules.config and Nextflow scripts.
  • modules_config - RGI_MAIN found in conf/modules.config and Nextflow scripts.
  • modules_config - AMPIR found in conf/modules.config and Nextflow scripts.
  • modules_config - AMPLIFY_PREDICT found in conf/modules.config and Nextflow scripts.
  • modules_config - AMP_HMMER_HMMSEARCH found in conf/modules.config and Nextflow scripts.
  • modules_config - MACREL_CONTIGS found in conf/modules.config and Nextflow scripts.
  • modules_config - BGC_HMMER_HMMSEARCH found in conf/modules.config and Nextflow scripts.
  • modules_config - ANTISMASH_ANTISMASHLITE found in conf/modules.config and Nextflow scripts.
  • modules_config - ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES found in conf/modules.config and Nextflow scripts.
  • modules_config - DEEPBGC_DOWNLOAD found in conf/modules.config and Nextflow scripts.
  • modules_config - DEEPBGC_PIPELINE found in conf/modules.config and Nextflow scripts.
  • modules_config - GECCO_RUN found in conf/modules.config and Nextflow scripts.
  • modules_config - HAMRONIZATION_ABRICATE found in conf/modules.config and Nextflow scripts.
  • modules_config - HAMRONIZATION_AMRFINDERPLUS found in conf/modules.config and Nextflow scripts.
  • modules_config - HAMRONIZATION_DEEPARG found in conf/modules.config and Nextflow scripts.
  • modules_config - HAMRONIZATION_RGI found in conf/modules.config and Nextflow scripts.
  • modules_config - HAMRONIZATION_FARGENE found in conf/modules.config and Nextflow scripts.
  • modules_config - HAMRONIZATION_SUMMARIZE found in conf/modules.config and Nextflow scripts.
  • modules_config - MERGE_TAXONOMY_HAMRONIZATION found in conf/modules.config and Nextflow scripts.
  • modules_config - ARG_TABIX_BGZIP found in conf/modules.config and Nextflow scripts.
  • modules_config - AMPCOMBI found in conf/modules.config and Nextflow scripts.
  • modules_config - MERGE_TAXONOMY_AMPCOMBI found in conf/modules.config and Nextflow scripts.
  • modules_config - AMP_TABIX_BGZIP found in conf/modules.config and Nextflow scripts.
  • modules_config - COMBGC found in conf/modules.config and Nextflow scripts.
  • modules_config - MERGE_TAXONOMY_COMBGC found in conf/modules.config and Nextflow scripts.
  • modules_config - BGC_TABIX_BGZIP found in conf/modules.config and Nextflow scripts.
  • modules_config - DRAMP_DOWNLOAD found in conf/modules.config and Nextflow scripts.
  • nfcore_yml - Repository type in .nf-core.yml is valid: pipeline
  • nfcore_yml - nf-core version in .nf-core.yml is set to the latest version: 2.14.1

Run details

  • nf-core/tools version 2.14.1
  • Run at 2024-05-22 14:02:15

jasmezz and others added 2 commits April 10, 2024 11:47
Member Author

@jfy133 jfy133 left a comment

You need to change feature -> gbk everywhere if we want to go that route, including schema_input.tsv etc., and we would also have to update the test-data samplesheet files 😬

Contributor

@Darcy220606 Darcy220606 left a comment

Just a few comments to consider :) but good job 💯

meta, files ->
def fasta_found = files.find{it.toString().tokenize('.').last().matches('fasta|fas|fna|fa')}
def faa_found = files.find{it.toString().endsWith('.faa')}
def gbk_found = files.find{it.toString().tokenize('.').last().matches('gbk')}

Contributor

This should match 'gbk|gbff' as well, since .gbff is the extension bakta uses for its GenBank output.
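To illustrate the suggestion, here is a minimal Python sketch of the same extension matching with 'gbff' accepted alongside 'gbk'. It is not the pipeline's Groovy code: the function names `last_ext` and `classify` and the sample file names are hypothetical, and the FASTA extension list is copied from the snippet above.

```python
# Extension sets; 'gbff' is included because Bakta writes its GenBank output as .gbff.
FASTA_EXTS = {"fasta", "fas", "fna", "fa"}
GBK_EXTS = {"gbk", "gbff"}


def last_ext(path: str) -> str:
    """Return the text after the final dot, roughly like tokenize('.').last()."""
    return path.rsplit(".", 1)[-1]


def classify(files):
    """Pick the first FASTA, FAA and GenBank path from a list (None when absent)."""
    fasta = next((f for f in files if last_ext(f) in FASTA_EXTS), None)
    faa = next((f for f in files if f.endswith(".faa")), None)
    gbk = next((f for f in files if last_ext(f) in GBK_EXTS), None)
    return {"fasta": fasta, "faa": faa, "gbk": gbk}


# Hypothetical Bakta-style inputs: the .gbff file is now recognised as GenBank.
result = classify(["sample_1.fna", "sample_1.faa", "sample_1.gbff"])
```

With only 'gbk' in the set, the `.gbff` file in this example would be left unmatched and the sample would look like it had no GenBank annotation.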

Contributor

I'll add it here, as I couldn't comment on it directly for some reason :D
So in line 188

    // TODO: Only NT at the moment. AA tax. classification will be added only when its PR is merged.

remove this: now that the user always supplies both FASTA and GBK files for the preannotated track, we don't need to update the taxonomy workflow ;)

Member Author

FFFFF bakta!? WHY!?

Member Author

Oooh that's a good question.. currently only FASTAs go to taxonomy... I can't remember if that's preanno ones too

Member Author

Wait, the user doesn't always supply GBK, I don't understand now...?

  ch_versions = ch_versions.mix( SEQKIT_SEQ_LONG.out.versions )
  ch_versions = ch_versions.mix( SEQKIT_SEQ_SHORT.out.versions )

- ch_prepped_input_long = SEQKIT_SEQ_LONG.out.fastx
+ ch_intermediate_input_long = SEQKIT_SEQ_LONG.out.fastx
+     .map{ meta, file -> [ meta + [id: meta.id + '_long', length: "long" ], file ] }

Contributor

Did you check the actual final output files (.tsv) with test_taxonomy.config? Please correct me if I'm misunderstanding the workflow: we are now renaming the meta.ids with the suffixes _long and _short. Won't that interfere with adding the taxonomy to the right files, since taxonomy is added according to not only the contig but also the sample_id, which comes from meta.id?

Member Author

No, which is why I asked you to test 😆

If I understand your question correctly:

meta.id indeed comes from sample, so updating meta replaces the sample_id with the suffixed one

But the taxonomy workflow takes its input from after this renaming, so it should be OK, I hope? Please run the branch and check 😬
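The concern in this thread can be sketched in Python under some loud assumptions: that the merge step keys on (sample_id, contig_id), and that records look like the dictionaries below. The helper names, field names and values are hypothetical and do not come from merge_taxonomy.py.

```python
def add_length_suffix(meta: dict, length: str) -> dict:
    """Return a copy of meta with '_<length>' appended to id, like meta + [id: ..., length: ...]."""
    return {**meta, "id": f"{meta['id']}_{length}", "length": length}


def merge_taxonomy(hits, taxonomy):
    """Attach a lineage to each hit by (sample_id, contig_id); None when no key matches."""
    lookup = {(t["sample_id"], t["contig_id"]): t["lineage"] for t in taxonomy}
    return [{**h, "lineage": lookup.get((h["sample_id"], h["contig_id"]))} for h in hits]


meta_long = add_length_suffix({"id": "sample_1"}, "long")

# A screening hit reported under the renamed sample id.
hits = [{"sample_id": meta_long["id"], "contig_id": "contig_7"}]

# Taxonomy computed after the renaming carries the suffixed id, so the keys agree.
tax_after = [{"sample_id": "sample_1_long", "contig_id": "contig_7", "lineage": "Bacteria"}]
# Taxonomy computed before the renaming would still carry the bare id.
tax_before = [{"sample_id": "sample_1", "contig_id": "contig_7", "lineage": "Bacteria"}]

merged_after = merge_taxonomy(hits, tax_after)
merged_before = merge_taxonomy(hits, tax_before)
```

In this toy setup the merge only finds a lineage when both tables were produced after the renaming, which is the condition the reply above relies on; whether the real pipeline meets it still needs the requested test run.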

Contributor

With regard to the summary and taxonomy:

  • ampcombi: it doesn't output all samples (but that will be fixed when I put in the new ampcombi submodules). Taxonomy is merged.

  • combgc: the final report doesn't include samples_1 and 2, which are preannotated; it only takes those from the annotation step, and I don't understand why. Taxonomy is merged.

  • hamronization: Works perfectly.

So either there's a problem with the collectFile() function for both ampcombi/combgc, or something's up with the publishDir path, so that every time the pipeline stops and resumes it rewrites the final report.

Comment on lines 189 to 191
  if ( params.run_taxa_classification ) {
-     TAXA_CLASS ( ch_prepped_input )
+     TAXA_CLASS ( ch_prepped_input.fastas )
      ch_versions = ch_versions.mix( TAXA_CLASS.out.versions )

Contributor

Suggested change
if ( params.run_taxa_classification ) {
TAXA_CLASS ( ch_prepped_input )
TAXA_CLASS ( ch_prepped_input.fastas )
ch_versions = ch_versions.mix( TAXA_CLASS.out.versions )
ch_intermediate_fasta_for_taxa = ch_intermediate_input.fastas.map{ meta, fasta, faa, gbk -> [ meta, fasta ] }
.mix(ch_intermediate_input.preannotated.map{ meta, fasta, faa, gbk -> [ meta, fasta ] })
if ( params.run_taxa_classification ) {
TAXA_CLASS ( ch_intermediate_fasta_for_taxa )
ch_versions = ch_versions.mix( TAXA_CLASS.out.versions )
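As a rough illustration of what the suggested change does to the data, the following Python sketch models the two branches as lists of (meta, fasta, faa, gbk) tuples. It only mimics the map and mix steps; the function name `fasta_for_taxa` and the sample values are hypothetical, and a real Nextflow mix gives no ordering guarantee, unlike the list concatenation used here.

```python
def fasta_for_taxa(fastas, preannotated):
    """Keep only (meta, fasta) from each 4-tuple and combine both branches."""
    def keep(rows):
        # Drop the protein and GenBank entries; taxonomy only needs the nucleotide FASTA.
        return [(meta, fasta) for meta, fasta, faa, gbk in rows]

    # Concatenation stands in for mix; the order is fixed here but not in Nextflow.
    return keep(fastas) + keep(preannotated)


# Hypothetical inputs: one sample to be annotated by the pipeline, one preannotated.
unannotated = [({"id": "sample_a"}, "sample_a.fna", None, None)]
preannotated = [({"id": "sample_b"}, "sample_b.fna", "sample_b.faa", "sample_b.gbk")]

for_taxa = fasta_for_taxa(unannotated, preannotated)
```

Both kinds of sample end up in the taxonomy input with the same two-element shape, which seems to be the point of the suggestion.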

Member Author

jfy133 commented May 15, 2024

OK tests to run:

  • 1 No pre-annotated without taxonomy AMP/ARG
  • 2 No pre-annotated with taxonomy AMP/ARG
  • 3 GBK preannotated without taxonomy AMP/ARG
  • 4 GBK preannotated with taxonomy AMP/ARG
  • 5 GBFF preannotated without taxonomy AMP/ARG
  • 6 GBFF preannotated with taxonomy AMP/ARG
  • 7 No pre-annotated without taxonomy BGC -> skipping antismash because problems on my install
  • 8 No pre-annotated with taxonomy BGC -> skipping antismash because problems on my install
  • 9 GBK preannotated without taxonomy BGC -> merge_taxonomy.py script not working correctly, only has single entry despite output in the others
  • 10 GBK preannotated with taxonomy BGC -> merge_taxonomy.py script not working correctly, only has single entry despite output in the others
  • 11 GBFF preannotated without taxonomy BGC
  • 12 GBFF preannotated with taxonomy BGC
  • 13 GBK preannotated without taxonomy BGC/AMP/ARG
  • 14 GBK preannotated with taxonomy BGC/AMP/ARG
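Entries 1 to 12 of the list above form a regular grid (screening type × input type × taxonomy on or off). A short Python sketch can enumerate that grid; the labels are copied from the list, the variable names are hypothetical, and the combined BGC/AMP/ARG runs (13 and 14) are deliberately left out because they are not part of the grid.

```python
from itertools import product

# Axes of the test plan, in the order that reproduces entries 1 to 12.
screens = ["AMP/ARG", "BGC"]
inputs = ["No pre-annotated", "GBK preannotated", "GBFF preannotated"]
taxonomy = ["without taxonomy", "with taxonomy"]

# product varies the last axis fastest, so taxonomy alternates within each input type.
matrix = [f"{i} {t} {s}" for s, i, t in product(screens, inputs, taxonomy)]

for number, label in enumerate(matrix, start=1):
    print(number, label)
```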

3 participants