Use eval output for tool versions #1115

bentsherman · 2023-11-15T22:41:56Z

This PR uses the experimental cmd output type in nextflow-io/nextflow#4493 to simplify the collection of tool versions.

Once the topic channel support is merged into Nextflow, we can merge this PR with #1109 to simplify things further. Instead of emitting versions1, versions2, etc for processes with multiple tools, we can simply send them all to the 'versions' topic.

PR checklist

This comment contains a description of changes (with reason).
Make sure your code lints (nf-core lint).
Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
Usage Documentation in docs/usage.md is updated.
Output Documentation in docs/output.md is updated.
CHANGELOG.md is updated.
README.md is updated (including new tool citations and authors/contributors).

Signed-off-by: Ben Sherman <bentshermann@gmail.com>

github-actions · 2023-11-15T22:58:42Z

`nf-core lint` overall result: Passed ✅ ⚠️

Posted for pipeline commit e68f451

+| ✅ 144 tests passed       |+
#| ❔   6 tests were ignored |#
!| ❗   5 tests had warnings |!

❗ Test warnings:

files_exist - File not found: .github/workflows/awstest.yml
files_exist - File not found: .github/workflows/awsfulltest.yml
nextflow_config - Config manifest.version should end in dev: 3.13.0
pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
pipeline_todos - TODO string in WorkflowRnaseq.groovy: Optionally add in-text citation tools to this list.

❔ Tests ignored:

files_unchanged - File ignored due to lint config: assets/email_template.html
files_unchanged - File ignored due to lint config: assets/email_template.txt
files_unchanged - File ignored due to lint config: lib/NfcoreTemplate.groovy
files_unchanged - File ignored due to lint config: .gitignore or .prettierignore or pyproject.toml
actions_awstest - 'awstest.yml' workflow not found: /home/runner/work/rnaseq/rnaseq/.github/workflows/awstest.yml
multiqc_config - multiqc_config

✅ Tests passed:

files_exist - File found: .gitattributes
files_exist - File found: .gitignore
files_exist - File found: .nf-core.yml
files_exist - File found: .editorconfig
files_exist - File found: .prettierignore
files_exist - File found: .prettierrc.yml
files_exist - File found: CHANGELOG.md
files_exist - File found: CITATIONS.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_exist - File found: nextflow_schema.json
files_exist - File found: nextflow.config
files_exist - File found: README.md
files_exist - File found: .github/.dockstore.yml
files_exist - File found: .github/CONTRIBUTING.md
files_exist - File found: .github/ISSUE_TEMPLATE/bug_report.yml
files_exist - File found: .github/ISSUE_TEMPLATE/config.yml
files_exist - File found: .github/ISSUE_TEMPLATE/feature_request.yml
files_exist - File found: .github/PULL_REQUEST_TEMPLATE.md
files_exist - File found: .github/workflows/branch.yml
files_exist - File found: .github/workflows/ci.yml
files_exist - File found: .github/workflows/linting_comment.yml
files_exist - File found: .github/workflows/linting.yml
files_exist - File found: assets/email_template.html
files_exist - File found: assets/email_template.txt
files_exist - File found: assets/sendmail_template.txt
files_exist - File found: assets/nf-core-rnaseq_logo_light.png
files_exist - File found: conf/modules.config
files_exist - File found: conf/test.config
files_exist - File found: conf/test_full.config
files_exist - File found: docs/images/nf-core-rnaseq_logo_light.png
files_exist - File found: docs/images/nf-core-rnaseq_logo_dark.png
files_exist - File found: docs/output.md
files_exist - File found: docs/README.md
files_exist - File found: docs/README.md
files_exist - File found: docs/usage.md
files_exist - File found: lib/nfcore_external_java_deps.jar
files_exist - File found: lib/NfcoreTemplate.groovy
files_exist - File found: lib/Utils.groovy
files_exist - File found: lib/WorkflowMain.groovy
files_exist - File found: main.nf
files_exist - File found: assets/multiqc_config.yml
files_exist - File found: conf/base.config
files_exist - File found: conf/igenomes.config
files_exist - File found: lib/WorkflowRnaseq.groovy
files_exist - File found: modules.json
files_exist - File found: pyproject.toml
files_exist - File not found check: Singularity
files_exist - File not found check: parameters.settings.json
files_exist - File not found check: pipeline_template.yml
files_exist - File not found check: .nf-core.yaml
files_exist - File not found check: bin/markdown_to_html.r
files_exist - File not found check: conf/aws.config
files_exist - File not found check: .github/workflows/push_dockerhub.yml
files_exist - File not found check: .github/ISSUE_TEMPLATE/bug_report.md
files_exist - File not found check: .github/ISSUE_TEMPLATE/feature_request.md
files_exist - File not found check: docs/images/nf-core-rnaseq_logo.png
files_exist - File not found check: .markdownlint.yml
files_exist - File not found check: .yamllint.yml
files_exist - File not found check: lib/Checks.groovy
files_exist - File not found check: lib/Completion.groovy
files_exist - File not found check: lib/Workflow.groovy
files_exist - File not found check: .travis.yml
nextflow_config - Config variable found: manifest.name
nextflow_config - Config variable found: manifest.nextflowVersion
nextflow_config - Config variable found: manifest.description
nextflow_config - Config variable found: manifest.version
nextflow_config - Config variable found: manifest.homePage
nextflow_config - Config variable found: timeline.enabled
nextflow_config - Config variable found: trace.enabled
nextflow_config - Config variable found: report.enabled
nextflow_config - Config variable found: dag.enabled
nextflow_config - Config variable found: process.cpus
nextflow_config - Config variable found: process.memory
nextflow_config - Config variable found: process.time
nextflow_config - Config variable found: params.outdir
nextflow_config - Config variable found: params.input
nextflow_config - Config variable found: params.validationShowHiddenParams
nextflow_config - Config variable found: params.validationSchemaIgnoreParams
nextflow_config - Config variable found: manifest.mainScript
nextflow_config - Config variable found: timeline.file
nextflow_config - Config variable found: trace.file
nextflow_config - Config variable found: report.file
nextflow_config - Config variable found: dag.file
nextflow_config - Config variable (correctly) not found: params.nf_required_version
nextflow_config - Config variable (correctly) not found: params.container
nextflow_config - Config variable (correctly) not found: params.singleEnd
nextflow_config - Config variable (correctly) not found: params.igenomesIgnore
nextflow_config - Config variable (correctly) not found: params.name
nextflow_config - Config variable (correctly) not found: params.enable_conda
nextflow_config - Config timeline.enabled had correct value: true
nextflow_config - Config report.enabled had correct value: true
nextflow_config - Config trace.enabled had correct value: true
nextflow_config - Config dag.enabled had correct value: true
nextflow_config - Config manifest.name began with nf-core/
nextflow_config - Config variable manifest.homePage began with https://github.com/nf-core/
nextflow_config - Config dag.file ended with .html
nextflow_config - Config variable manifest.nextflowVersion started with >= or !>=
nextflow_config - Config params.custom_config_version is set to master
nextflow_config - Config params.custom_config_base is set to https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Lines for loading custom profiles found
files_unchanged - .gitattributes matches the template
files_unchanged - .prettierrc.yml matches the template
files_unchanged - CODE_OF_CONDUCT.md matches the template
files_unchanged - LICENSE matches the template
files_unchanged - .github/.dockstore.yml matches the template
files_unchanged - .github/CONTRIBUTING.md matches the template
files_unchanged - .github/ISSUE_TEMPLATE/bug_report.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/config.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/feature_request.yml matches the template
files_unchanged - .github/PULL_REQUEST_TEMPLATE.md matches the template
files_unchanged - .github/workflows/branch.yml matches the template
files_unchanged - .github/workflows/linting_comment.yml matches the template
files_unchanged - .github/workflows/linting.yml matches the template
files_unchanged - assets/sendmail_template.txt matches the template
files_unchanged - assets/nf-core-rnaseq_logo_light.png matches the template
files_unchanged - docs/images/nf-core-rnaseq_logo_light.png matches the template
files_unchanged - docs/images/nf-core-rnaseq_logo_dark.png matches the template
files_unchanged - docs/README.md matches the template
files_unchanged - lib/nfcore_external_java_deps.jar matches the template
actions_ci - '.github/workflows/ci.yml' is triggered on expected events
actions_ci - '.github/workflows/ci.yml' checks minimum NF version
readme - README Nextflow minimum version badge matched config. Badge: 23.04.0, Config: 23.04.0
readme - README Zenodo placeholder was replaced with DOI.
pipeline_name_conventions - Name adheres to nf-core convention
template_strings - Did not find any Jinja template strings (257 files)
schema_lint - Schema lint passed
schema_lint - Schema title + description lint passed
schema_lint - Input mimetype lint passed: 'text/csv'
schema_params - Schema matched params returned from nextflow config
system_exit - No System.exit calls found
actions_schema_validation - Workflow validation passed: branch.yml
actions_schema_validation - Workflow validation passed: release-announcments.yml
actions_schema_validation - Workflow validation passed: linting_comment.yml
actions_schema_validation - Workflow validation passed: cloud_tests_small.yml
actions_schema_validation - Workflow validation passed: ci.yml
actions_schema_validation - Workflow validation passed: clean-up.yml
actions_schema_validation - Workflow validation passed: linting.yml
actions_schema_validation - Workflow validation passed: fix-linting.yml
actions_schema_validation - Workflow validation passed: cloud_tests_full.yml
merge_markers - No merge markers found in pipeline files
modules_json - Only installed modules found in modules.json
modules_structure - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'

Run details

nf-core/tools version 2.10
Run at 2023-11-16 01:26:58

bentsherman · 2023-11-27T22:26:21Z

@drpatelh @ewels now that Nextflow has channel topics, it occurred to me that we could actually simplify a lot by just using env outputs. See my comment here, but I will copy the example code to illustrate my point:

// current nf-core convention
process FOOBAR {
    output:
    path 'versions.yml', topic: versions

    """
    # ...

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        foo: \$(foo --version)
        bar: \$(bar --version)
    END_VERSIONS
    """
}

// env output
process FOOBAR {
    output:
    tuple val("${task.process}"), val('foo'), env(FOO_VERSION), topic: versions
    tuple val("${task.process}"), val('bar'), env(BAR_VERSION), topic: versions

    """
    # ...

    FOO_VERSION=\$(foo --version)
    BAR_VERSION=\$(bar --version)
    """
}

// cmd output
process FOOBAR {
    output:
    tuple val("${task.process}"), val('foo'), cmd('foo --version'), topic: versions
    tuple val("${task.process}"), val('bar'), cmd('bar --version'), topic: versions

    """
    # ...
    """
}

I would love to hear what you guys think about (2) vs (3), i.e. the env output vs the cmd output. Keep in mind that in all three cases, the tool version commands are executed in the task script in more or less the same way.

ewels · 2023-12-11T14:57:48Z

My preference is for option 3 - the new cmd style. I like keeping the version commands out of the script, as it makes the script commands much cleaner and easier to read.

NB: The versions1/versions2 stuff in the PR code diff can be simplified after #1109 is merged. This new syntax is shown in Ben's comment.

maxulysse · 2023-12-11T15:04:04Z

My preference goes to version2, which I find more explicit, easier to read, but I do love the version3 that removes completely the version generation from the script itself.

edmundmiller · 2023-12-11T15:09:30Z

I like three, my only concern is some of the commands to get the version get pretty long. In theory, we could do something like:

def foo_version = 'foo --version'

output:
    tuple val("${task.process}"), val('foo'), cmd("${foo_version}"), topic: versions

mirpedrol · 2023-12-11T15:10:04Z

I like option 3!
But I wonder about modules with R or python scripts, where we use those languages to create the versions.yml instead of bash. Will this work? or do we have to continue using the old syntax for these cases?

pinin4fjords · 2023-12-11T15:21:34Z

I would really like the inputs/outputs section to remain as concise as possible, and I like the separation of concerns where the command to produce the output happens in more or less the same place. I'd do a bit of a WTF if people suddenly started embedding extensive process stuff where I expect the I/O.

So I have a fairly strong dislike for option 3), I think some fairly horrific stuff could happen there and make the processes hard to understand.

So option 2) for me please!

maxulysse · 2023-12-11T15:23:56Z

Agreeing with @pinin4fjords there: version3 looks beautiful as long as everything works well, but when it starts to bug, it's a mess to debug.

ewels · 2023-12-11T15:34:42Z

@pinin4fjords - note that one of the limitations of cmd (which will be documented) is that it doesn't support newlines.

That will hopefully prevent people from doing anything too horrendous 😆

We could have an nf-core modules linting rule that checks the string length and fails if it's too long, suggesting that people use env in that particular case instead.
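
As a rough sketch of what such a length check could look like, here is a small Bash helper. The function name and the 80-character default are assumptions made up for illustration; nothing like this exists in nf-core/tools as far as this thread shows.

```shell
# Hypothetical lint helper: fail when a version command exceeds a length limit.
# The 80-character default is an arbitrary assumption, not an nf-core rule.
check_version_cmd_length() {
    local cmd="$1"
    local max_len="${2:-80}"
    if [ "${#cmd}" -gt "$max_len" ]; then
        echo "version command is ${#cmd} characters (limit ${max_len}); consider an env output instead" >&2
        return 1
    fi
    return 0
}
```

Calling `check_version_cmd_length 'foo --version'` returns 0, while a long pipeline checked against a small limit returns 1 and prints the suggestion to stderr.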

pinin4fjords · 2023-12-11T15:37:58Z

> @pinin4fjords - note that one of the limitations of cmd (which will be documented) is that it doesn't support newlines.
>
> That will hopefully prevent people from doing anything too horrendous 😆

There's plenty of evil to be done with pipes!

ewels · 2023-12-11T15:39:16Z

> But I wonder about modules with R or python scripts, where we use those languages to create the versions.yml instead of bash. Will this work? or do we have to continue using the old syntax for these cases?

@mirpedrol - No, it won't work. The suggestion would be to use env in these cases, as in option 2 (no need for the old syntax with the cat <<-END_VERSIONS stuff). But there are relatively few of these non-bash modules; none in rnaseq, for example, I think.

pinin4fjords · 2023-12-11T16:03:42Z

rnaseq has a couple of R modules actually, they're just not obvious because they're local. We will hopefully fix that at some point, and they will then need templates etc.

bentsherman · 2023-12-11T23:01:56Z

Thank you all for your feedback. I still prefer env myself, but Paolo is determined now to add the cmd type, so we will have both and you can use whichever one you prefer.

> My preference goes to version2, which I find more explicit, easier to read, but I do love the version3 that removes completely the version generation from the script itself.

Note that the cmd type is still executed in the task script just like an env; it just inserts the command for you.

> I like three, my only concern is some of the commands to get the version get pretty long.

@emiller88 I don't think you can reference local variables in an output as in your example, but you could reference a global variable, for example:

foo_version = 'really | long | version | command'

process foo {
  output:
  cmd("${foo_version}")
}

> But I wonder about modules with R or python scripts, where we use those languages to create the versions.yml instead of bash. Will this work? or do we have to continue using the old syntax for these cases?

@mirpedrol In this PR I changed all the processes to only emit the metadata, and the YAML is then constructed at the end of the pipeline. If you usually generate the tool version from within a Python or R script, the cmd output could do something like python script.py --version to retrieve the version from Bash. If the process script itself is not Bash, however, then the cmd output won't work. So whenever cmd isn't supported or would be unwieldy to use, you can always fall back to an env output.
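
To illustrate the data shape only: the PR builds the YAML in the pipeline code, not in Bash, so the helper below is a hypothetical stand-in. It assumes the emitted metadata arrives as tab-separated process, tool, version lines, which is an assumption made for this sketch.

```shell
# Turn "process<TAB>tool<TAB>version" lines into YAML grouped by process.
# Sorting with -u removes the duplicates that repeated tasks would emit,
# and keeps lines for the same process next to each other.
versions_to_yaml() {
    sort -u | awk -F '\t' '
        $1 != current { current = $1; printf "\"%s\":\n", current }
        { printf "    %s: %s\n", $2, $3 }
    '
}

# Example input: FOOBAR ran twice and reported foo both times.
printf 'FOOBAR\tfoo\t1.2.3\nFOOBAR\tbar\t0.9\nFOOBAR\tfoo\t1.2.3\n' | versions_to_yaml
```

The result has the same layout as the old versions.yml: one quoted process name followed by indented tool: version pairs.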

> note that one of the limitations of cmd (which will be documented) is that it doesn't support newlines

You could have a multi-line command by using semi-colons for newlines 😅
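
A minimal sketch of that trick in plain Bash, using a hard-coded string as a stand-in for real tool output (the function names and the sample string are made up for illustration):

```shell
# Multi-line form, as it might appear in a script block.
multi_line() {
    raw='foo version 1.2.3'   # stand-in for the output of a real tool
    version=${raw##* }        # keep the text after the last space
    echo "$version"
}

# The same steps on one line, separated by semicolons.
one_line() { raw='foo version 1.2.3'; version=${raw##* }; echo "$version"; }
```

Both functions print the same thing, so anything that fits in a few short statements can be squeezed into a single-line command string, for better or worse.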

Regarding multi-line outputs, we found a way to support them for both env and cmd. So whereas currently env outputs are squashed to a single line, both will support multi-line output going forward.

ewels · 2023-12-12T09:47:08Z

So I think everyone agrees that options 2 + 3 are both improvements ✅

For any processes with script blocks written in languages other than Bash, we will have to use the env approach. For Bash commands I now see three options, which maybe we can vote on in the nf-core Slack:

Option 1: `env`

// env output
process FOOBAR {
    output:
    tuple val("${task.process}"), val('foo'), env(FOO_VERSION), topic: versions
    tuple val("${task.process}"), val('bar'), env(BAR_VERSION), topic: versions

    """
    # ...

    FOO_VERSION=\$(foo --version)
    BAR_VERSION=\$(bar --version)
    """
}

Option 2: `cmd`

// cmd output
process FOOBAR {
    output:
    tuple val("${task.process}"), val('foo'), cmd('foo --version'), topic: versions
    tuple val("${task.process}"), val('bar'), cmd('bar --version'), topic: versions

    """
    # ...
    """
}

Option 3: `cmd` + variable

// cmd + variable output
foo_version = 'foo --version'
bar_version = 'bar --version'
process FOOBAR {
    output:
    tuple val("${task.process}"), val('foo'), cmd(foo_version), topic: versions
    tuple val("${task.process}"), val('bar'), cmd(bar_version), topic: versions

    """
    # ...
    """
}

edmundmiller · 2023-12-12T21:02:51Z

I think the real thing this could open up is parsing the version string in Groovy as another option:

// cmd + variable output
foo_version = getVersionFromString('foo --version')
bar_version = 'bar --version'

process FOOBAR {
    output:
    tuple val("${task.process}"), val('foo'), cmd(foo_version), topic: versions
}

in a lib far far away:

def getVersionFromString(String text) {
    def matcher = text =~ /v(\d+\.\d+\.\d+)/
    return matcher ? matcher[0][1] : null
}

Just a thought.

bentsherman · 2023-12-12T21:49:40Z

The thing is that the command must be executed in the task environment, because Nextflow might not have access to the tool from outside the task.

You could just emit the raw output of the tool version command, remove the duplicates, and then parse the string in Groovy:

process FOOBAR {
    output:
    tuple val("${task.process}"), val('foo'), cmd('foo --version'), topic: versions
}

Channel.topic('versions').map { process, tool, raw_version ->
    [ process, tool, getVersionFromString(tool, raw_version) ]
}

That comes down to whether you would rather parse the version with a Bash one-liner or Groovy code. Note that you have to write a custom parser for every tool, so putting it all in a lib far far away would break the modularity of your modules. Unless you have a way to "register" a parser from the module script.
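
For comparison, here is roughly what the Bash one-liner side of that trade-off could look like. The `foo` function is a stand-in for a real tool and its output format is invented; `extract_version` is a hypothetical helper that mirrors the `v(\d+\.\d+\.\d+)` pattern from the Groovy example above.

```shell
# Stand-in for a real tool; the output format here is invented for illustration.
foo() { echo "foo (example toolkit) v1.2.3, built 2023-11-15"; }

# One-liner: print only the first MAJOR.MINOR.PATCH that follows a "v".
# Lines without a match produce no output at all.
extract_version() { sed -nE 's/.*v([0-9]+\.[0-9]+\.[0-9]+).*/\1/p' | head -n 1; }

foo --version | extract_version   # prints 1.2.3
```

Either way, the pattern is specific to the tool's output format, so each module would still carry its own parsing logic.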

bentsherman added 3 commits November 15, 2023 06:48

Use cmd output for tool versions

4e84d37

Signed-off-by: Ben Sherman <bentshermann@gmail.com>

Merge branch 'dev' into cmd-output

8734ccb

Signed-off-by: Ben Sherman <bentshermann@gmail.com>

Fix failing process

e68f451

Signed-off-by: Ben Sherman <bentshermann@gmail.com>

bentsherman mentioned this pull request Dec 6, 2023

Use channel topic for tool versions #1109

Open

Remove references to versions.yml in config

f79e09d

Signed-off-by: Ben Sherman <bentshermann@gmail.com>

mahesh-panchal mentioned this pull request Apr 4, 2024

NFCORE_RNASEQ:RNASEQ:CUSTOM_DUMPSOFTWAREVERSIONS (1)` terminated with an error exit status (1) #1103

Closed

bentsherman changed the title ~~Use cmd output for tool versions~~ Use eval output for tool versions Apr 20, 2024
