proposal: use CWL for portable tool and workflow definitions #73

Open
mr-c opened this issue Mar 4, 2017 · 4 comments

Comments

@mr-c

mr-c commented Mar 4, 2017

Hello, I am Michael R. Crusoe, one of the co-founders of the Common Workflow Language project, its Community Engineer, and a former sysadmin/SRE.

I really like this project's ethos!

You may benefit from examining the standards from the Common Workflow Language project. They define an interface for running command-line data analysis programs and the workflows built from them.

These standardized descriptions are portable and executable on a variety of workflow management systems that collectively support almost any backend: local execution and various HPC and cloud interfaces. Using a standard description decouples the execution from the description of the data analysis, calculation, or simulation.
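
As a rough illustration of what such a description looks like, the sketch below builds a minimal CWL v1.0 `CommandLineTool` that wraps `echo` and prints it as JSON (CWL documents are YAML, and JSON is a subset of YAML). The function name `echo_tool` and the input name `message` are illustrative choices, not something taken from this thread.

```python
import json


def echo_tool():
    """Return a minimal CWL v1.0 CommandLineTool description for `echo`.

    The description only states what to run and which inputs it takes;
    it says nothing about where or how the command is executed.
    """
    return {
        "cwlVersion": "v1.0",
        "class": "CommandLineTool",
        "baseCommand": "echo",
        "inputs": {
            # One string input, passed as the first positional argument.
            "message": {"type": "string", "inputBinding": {"position": 1}},
        },
        "outputs": [],
    }


if __name__ == "__main__":
    print(json.dumps(echo_tool(), indent=2))
```

Saved to a file such as `echo.cwl`, a description like this could be handed to a CWL runner (for example `cwltool echo.cwl --message hello`), and the same file should work unchanged on other compliant runners.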

I would also like to bring to your attention the http://researchobject.org/ methods and standards for representing an output of research (a figure, data tables, raw results, and others) along with attribution for all contributors, provenance for the data and code, and an abstract representation of the workflow used to create the output (regardless of whether CWL or a system-specific approach was taken).

Cheers,

@ivotron
Collaborator

ivotron commented Mar 8, 2017

Hi @mr-c, thanks a lot for reaching out.

We learned about CWL thanks to a reference from the Genomics folks here at UCSC (devs of Toil). I investigated a bit whether we could express some of our workflows in it, but at the time (~6 months ago) it didn't seem able to express loops and distributed tasks. I find that most of the experiments we do in distributed storage, analysis and data management systems are relatively simple and can be expressed using bash, Ansible and/or docker compose. These are simple benchmarking experiments which, when visualized as a DAG, would be something like 3 nodes without much complexity.

Having said that, the goal of Popper is to be tool-agnostic: it abstracts experiments, treats them as a black box, and assumes that they will be implemented using reproducibility-friendly tools and standards (such as CWL, researchobject, bash and Ansible). Since Popper is a methodology, one can use any tool as long as the researcher is comfortable creating scripts for it.

thanks!

@mr-c
Author

mr-c commented Mar 14, 2017

You are welcome @ivotron, glad to hear that you've already thought about this. Cheers!

@mr-c mr-c closed this as completed Mar 14, 2017
@mr-c
Author

mr-c commented Dec 1, 2020

Dear @ivotron

In https://conferences.computer.org/scwpub/pdfs/CANOPIE-HPC2020-6GN8joymhwMWpK4pB3hqMl/306200a008/306200a008.pdf (DOI 10.1109/CANOPIEHPC51917.2020.00007) it is written:

"Popper allows exporting a workflow to other workflow specification formats such as CWL"

Can you point me to any code or documentation about that?

@mr-c mr-c reopened this Dec 1, 2020
@ivotron
Collaborator

ivotron commented Dec 1, 2020

hi @mr-c, thanks a lot for reaching out!

We haven't documented the API yet. The code is available here. The WorkflowExporter class makes it possible to create plugins that export Popper workflows to other formats such as CWL, and we are planning to implement exporters for CI services (CircleCI, GitLab, and GitHub Actions).

We would love to have one exporter for CWL! 😃
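
Since the API is not documented in this thread, the following is only a sketch of how a plugin-style exporter could be organized. It is not Popper's actual code. The base class here merely reuses the name `WorkflowExporter`; the registry, the `get_exporter` helper, the `fmt` keyword, and the workflow dictionary shape (`id`, `image`, `command`) are all assumptions made for illustration.

```python
import json
from abc import ABC, abstractmethod


class WorkflowExporter(ABC):
    """Hypothetical plugin base class (NOT Popper's real implementation)."""

    _registry = {}

    def __init_subclass__(cls, fmt=None, **kwargs):
        # Register each subclass under the format name it declares.
        super().__init_subclass__(**kwargs)
        if fmt is not None:
            WorkflowExporter._registry[fmt] = cls

    @classmethod
    def get_exporter(cls, fmt):
        """Instantiate the exporter registered for `fmt`."""
        try:
            return cls._registry[fmt]()
        except KeyError:
            raise ValueError(f"no exporter registered for {fmt!r}") from None

    @abstractmethod
    def export(self, workflow):
        """Return the workflow serialized in the target format."""


class CWLExporter(WorkflowExporter, fmt="cwl"):
    """Maps each step to an embedded CWL CommandLineTool.

    Assumes a simplified workflow dict: {"steps": [{"id", "image", "command"}]}.
    Step ordering is not preserved, because CWL derives execution order
    from data dependencies, which this sketch does not model.
    """

    def export(self, workflow):
        steps = {}
        for step in workflow["steps"]:
            steps[step["id"]] = {
                "run": {
                    "class": "CommandLineTool",
                    "baseCommand": step["command"],
                    "hints": {
                        "DockerRequirement": {"dockerPull": step["image"]},
                    },
                    "inputs": {},
                    "outputs": {},
                },
                "in": {},
                "out": [],
            }
        doc = {
            "cwlVersion": "v1.0",
            "class": "Workflow",
            "inputs": {},
            "outputs": {},
            "steps": steps,
        }
        return json.dumps(doc, indent=2)


if __name__ == "__main__":
    wf = {"steps": [{"id": "hello", "image": "alpine:3.9",
                     "command": ["echo", "hi"]}]}
    print(WorkflowExporter.get_exporter("cwl").export(wf))
```

With this layout, an exporter for another target would be one more subclass declaring its own `fmt`, with no change to the code that looks exporters up.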
