Reads the output from CLI help commands, and generates machine readable schemas (CWL etc)

aCLImatise/CliHelpParser

aCLImatise

For the full documentation, refer to the GitHub Pages website.


aCLImatise is a Python library and command-line utility for parsing the help output of a command-line tool and then outputting a description of the tool in a more structured format, for example a Common Workflow Language tool definition.

aCLImatise currently supports both CWL and WDL output. Other formats will be considered in the future, and pull requests that add support for them are especially welcome.

Please also refer to The aCLImatise Base Camp, which is a database of pre-computed tool definitions generated by the aCLImatise parser. Most bioinformatics tools have a tool definition already generated in the Base Camp, so you may not need to run aCLImatise directly.

aCLImatise is now published in the journal Bioinformatics. You can read the application note here: https://doi.org/10.1093/bioinformatics/btaa1033. To cite aCLImatise, please use the citation generator provided by the journal.

Example

Let's say you want to create a CWL workflow containing the common Unix wc (word count) utility. Running wc --help returns:

Usage: wc [OPTION]... [FILE]...
  or: wc [OPTION]... --files0-from=F
Print newline, word, and byte counts for each FILE, and a total line if
more than one FILE is specified.  A word is a non-zero-length sequence of
characters delimited by white space.

With no FILE, or when FILE is -, read standard input.

The options below may be used to select which counts are printed, always in
the following order: newline, word, character, byte, maximum line length.
  -c, --bytes            print the byte counts
  -m, --chars            print the character counts
  -l, --lines            print the newline counts
      --files0-from=F    read input from the files specified by
                           NUL-terminated names in file F;
                           If F is - then read names from standard input
  -L, --max-line-length  print the maximum display width
  -w, --words            print the word counts
      --help     display this help and exit
      --version  output version information and exit

GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
Full documentation at: <http://www.gnu.org/software/coreutils/wc>
or available locally via: info '(coreutils) wc invocation'
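
To give a feel for the kind of structure that can be recovered from help text like the above, here is a minimal Python sketch. It is not aCLImatise's parser: the `Flag` tuple, the `FLAG_LINE` regular expression and `parse_flags` are hypothetical names invented for this illustration, and the pattern only recognises GNU-style lines of the form `-x, --long[=ARG]  description`.

```python
import re
from typing import List, NamedTuple, Optional


class Flag(NamedTuple):
    """One option recovered from a help line (illustrative type only)."""
    short: Optional[str]   # e.g. "-c", or None when only a long form exists
    long: str              # e.g. "--bytes"
    arg: Optional[str]     # e.g. "F" for --files0-from=F, None for switches
    description: str


# Assumption: options are indented, the short form is optional, and the
# long form may carry "=ARG". Wrapped description lines do not match.
FLAG_LINE = re.compile(r"^\s+(?:(-\w),\s+)?(--[\w-]+)(?:=(\w+))?\s+(.*)$")


def parse_flags(help_text: str) -> List[Flag]:
    """Return every line of help_text that looks like an option definition."""
    flags = []
    for line in help_text.splitlines():
        match = FLAG_LINE.match(line)
        if match:
            flags.append(Flag(*match.groups()))
    return flags


# A shortened excerpt of the wc help text shown above.
HELP = """\
Usage: wc [OPTION]... [FILE]...
  -c, --bytes            print the byte counts
      --files0-from=F    read input from the files specified by
                           NUL-terminated names in file F;
  -L, --max-line-length  print the maximum display width
  -w, --words            print the word counts
"""

for flag in parse_flags(HELP):
    kind = "takes an argument" if flag.arg else "is a switch"
    print(flag.long, kind)
```

A line-by-line regular expression like this breaks down quickly on real-world help text (multi-line descriptions, positional arguments, subcommands), which is part of why a dedicated parser is useful.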

If you run aclimatise explore wc, which means "parse the wc command and all subcommands", you'll end up with the following files in your current directory:

  • wc.cwl
  • wc.wdl
  • wc.yml
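
If you would rather drive this step from a script, one option is to call the command-line tool through `subprocess`. The sketch below is only an illustration: `explore` and `expected_outputs` are hypothetical helper names, and it assumes that the `aclimatise` executable is on your PATH and that it writes its output into the working directory, as described above.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List

# The three output formats listed above.
FORMATS = ("cwl", "wdl", "yml")


def expected_outputs(tool: str, directory: str = ".") -> List[Path]:
    """Paths we expect `aclimatise explore <tool>` to create in directory."""
    return [Path(directory) / f"{tool}.{ext}" for ext in FORMATS]


def explore(tool: str, directory: str = ".") -> List[Path]:
    """Run `aclimatise explore <tool>` and return the files that now exist."""
    if shutil.which("aclimatise") is None:
        raise FileNotFoundError("the aclimatise executable is not on PATH")
    # cwd controls where the definitions are written.
    subprocess.run(["aclimatise", "explore", tool], cwd=directory, check=True)
    return [path for path in expected_outputs(tool, directory) if path.exists()]


print([path.name for path in expected_outputs("wc")])
```

Under those assumptions, calling `explore("wc")` would return the paths of wc.cwl, wc.wdl and wc.yml.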

These are representations of the command wc in 3 different formats. If you look at wc.wdl, you'll see that it contains a WDL-compatible tool definition for wc:

version 1.0
task Wc {
  input {
    Boolean bytes
    Boolean chars
    Boolean lines
    String files__from
    Boolean max_line_length
    Boolean words
  }
  command <<<
    wc \
      ~{true="--bytes" false="" bytes} \
      ~{true="--chars" false="" chars} \
      ~{true="--lines" false="" lines} \
      ~{if defined(files__from) then ("--files0-from " +  '"' + files__from + '"') else ""} \
      ~{true="--max-line-length" false="" max_line_length} \
      ~{true="--words" false="" words}
  >>>
}
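
To make the placeholders in the command section concrete, the following Python sketch mimics how they would expand for a given set of inputs. `render_wc` and `SWITCHES` are hypothetical names used only for illustration. In practice a WDL engine performs this interpolation, and the sketch builds an argument list rather than reproducing the shell quoting in the task.

```python
import shlex
from typing import Dict, List, Optional, Union

# Boolean inputs of the Wc task and the switch each one emits when true.
SWITCHES = {
    "bytes": "--bytes",
    "chars": "--chars",
    "lines": "--lines",
    "max_line_length": "--max-line-length",
    "words": "--words",
}

InputValue = Union[bool, Optional[str]]


def render_wc(inputs: Dict[str, InputValue]) -> List[str]:
    """Build the argument list in the same order as the WDL command section."""
    argv = ["wc"]
    for name in ("bytes", "chars", "lines"):
        if inputs.get(name):
            argv.append(SWITCHES[name])
    # Mirrors the `if defined(files__from)` branch of the task.
    if inputs.get("files__from") is not None:
        argv += ["--files0-from", str(inputs["files__from"])]
    for name in ("max_line_length", "words"):
        if inputs.get(name):
            argv.append(SWITCHES[name])
    return argv


print(shlex.join(render_wc({"lines": True, "files__from": "names.txt"})))
# prints: wc --lines --files0-from names.txt
```

Inputs that are false or missing contribute nothing to the command line, which matches the `false=""` half of each placeholder.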