Language agnostic Documentation extractor
License
cehteh/pipadoc
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
pipadoc - Documentation extractor ================================= :author: Christian Thaeter :email: ct@pipapo.org :date: 16. September 2020 [preface] Introduction ------------ Embedding documentation in program source files often results in the problem that the structure of a program is not the optimal structure for the associated documentation. Still there are many good reasons to maintain documentation together with the source right within the code which defines the documented functionality. Pipadoc addresses this problem by extracting special comments out of a source file and let one define rules how to compile the documentation into proper order. This is somewhat similar to ``literate Programming'' but it puts the emphasis back to the code. Pipadoc is programming language and documentation system agnostic, all it requires is that the programming language has some form of comments starting with a defined character sequence and spanning to the end of the source line. Moreover documentation parts can be written in plain text files aside from the sources. Getting the Source ------------------ Pipadoc is managed the git revision control system. You can clone the repository with git clone --depth 1 git://git.pipapo.org/pipadoc Pipadoc is developed in the 'devel' and feature branches using rolling releases. Whenever stability is reached things get pushed to the 'master' branch. In few cases this may include backward incompatible changes. When upgrading one should check and eventually fix resulting issues. Installation ------------ Pipadoc is single Lua source file `pipadoc.lua` which is portable among most Lua versions (PUC Lua 5.1, 5.2, 5.3, 5.4, Luajit, Ravi). It ships with a `pipadoc.install` shell script which figures a suitable Lua version out and installs `pipadoc.lua` as `pipadoc` in a given directory (the current directory by default). There are different ways how this can be used in a project: - When a installed Lua version is known from the build tool chain one can include the `pipadoc.lua` into the project and call it with the known Lua interpreter. - One can rely on a pipadoc installed in '$PATH' and just call that from the build tool chain - One could ship the `pipadoc.lua` and `pipadoc.install` and do a local install in the build directory and use this pipadoc thereafter. Usage ----- Pipadoc is called with options and all input files. It may read a configuration file. When generating output it is either send to _stdout_ or saved as a given output file. ..... pipadoc [options...] [inputs..] options are: -v, --verbose increment verbosity level -q, --quiet suppresses any messages -d, --debug set verbosity to debug level one additional -v enables tracing -n, --dry-run do not generate output -h, --help show this help -r, --register <name> <file> <comment> register a filetype pattern for files matching a file pattern -t, --toplevel <name> sets 'name' as toplevel node [MAIN] -c, --config <name> selects a config file [pipadoc_config.lua] --no-defaults disables default filetypes and configfile loading --list-sections Parses input and lists all sections on stdout appended with '[section]', '[keys]' or [section keys]' depending on the contents, includes '--dry-run' -m, --markup <name> selects the markup engine for the output [text] -o, --output <file> writes output to 'file' [stdout] -a, --alias <pattern> <as> aliases filenames to another filetype. for example, treat .install files as shell files: --alias '(.*)%.install' '%1.sh' -D, --define <name>[=<value>] define a GLOBAL variable to value or 'true' -D, --define -<name> undefine a GLOBAL variable -- stops parsing the options and treats each following argument as input file inputs are file names or a '-' that indicates standard input ..... Basic concepts -------------- Pipadoc is controlled by special line comments. This is chosen because it is the most common denominator between almost all programming languages. To make a line comment recognized as pipadoc comment it needs to be followed immediately by a operator sequence. Which in the simplest case is just a single punctuation character. These comments are stored to named documentation sections. One of the main concepts of pipadoc is that one can define which and in what order these sections appear in the final output. To add special functionality and extend the semantics one can define pre and post processors. Preprocessors are defined per input programming language and process all source lines. They can modify the source arbitrary before pipadoc does the parsing. This allows to generate completely new content. Lift parts on the source code side over to the documentation and generate additional information such as indices and glossaries. Postprocessors are defined for output markup and process every line in output order. The allow to augment output in a markup specific way. There is a string substitution/template engine which can processes text. The string substitution engine also implements a small lisp-like programming language which can be used to generate content programmatically. Syntax ------ .In short: A pipadoc comment is any 'line-comment' of the programming language directly (without spaces) followed by a optional alphanumeric section name which may have a sorting key appended by a dot, followed by an operator, followed by an optional argument. Possibly followed by the documentation text itself. Only lines qualify this syntax are processed as pipadoc documentation. Preprocessors run before the parsing is done and may translate otherwise non pipadoc sourcecode into documentation. .The formal syntax looks like: .... <pipadoc> ::= [source] <linecomment> <opspec> [ <space> [documentationtext]] <source> ::= <any source code text> <linecomment> ::= <the filetypes linecomment sequence> <opspec> ::= [section['.'key]] <operator> [argument] <section> ::= <alphanumeric text including underscore> <operator> ::= ":" | "+" | "=" | "{" | "}" | "@" | "$" | "#" | "!" | <user defined operators> <key> ::= <alphanumeric text including underscore> <argument> ::= <alphanumeric text including underscore> <documentationtext> ::= <rest of the line> .... Sections and Keys ----------------- Text in pipadoc is stored in named 'sections' and can be associated with some additional alphanumeric key under that section. This enables later sorting for indices and glossaries. One-line sections are defined when a section and maybe a key is followed by documentation text. One-line doctext can be interleaved into Blocks. At the start if a input file the default block section name is made from the files name up to the first dot. Sections are later brought into the desired order by pasting them into a 'toplevel' section. This default name for the 'toplevel' section is 'MAIN_{markup}' or if that does not exist just 'MAIN'. Order of operations ------------------- Pipadoc reads all files line by line. and processes them in the following order: Preprocessing :: Preprocessors are Lua functions who may alter the entire content of a line before any further processing. They get a 'context' table passed in with the 'SOURCE' member containing the line read from the input file. Parsing :: The line is broken down into its components and the operators processing function will be called. The ':' and '+' operators do a first string substitution pass to expand variables. This string substitution is done in input order. String substitution macros may leverage this for additional state and may generate extra content like indices and append that to the respective sections. Output Ordering :: The output order is generated by assembling the '{toplevel}_{markup}' or if that does not exist the '{toplevel}' section. The paste and sorting operators there define the section order of the document. The conditional operators '{' and '}' are also evaluated at this stage and may omit some blocks depending on selection predicates. Postprocessing :: For each output context the postprocessors run in output order. Finally a last string substitution pass is applied in output order. This pass can generate markup specific changes. Writeout :: The finished document is written to the output. Report empty sections, Orphans and Doubletes:: Pipadoc keeps stats on how each section was used. Finally it gives a report (as warning) on sections which appear to be unused or used more than once. These warnings may be ok, but sometimes they give useful hints about typing errors. To suppress such reports of intentional left out sections one can use the '!' operator. It is important to know that reading happens only line by line, operations can not span lines. While Processing steps can be stateful and thus preserve information for further processing. Filetypes --------- Pipadoc needs to know about the syntax of line comments of the files it is reading. For this patterns are registered to be matched against the file name together with a list of line comment characters. Definitions for a common programming languages are already included. For languages that support block comments the opening (but not the closing) commenting characters are registered as well. This allows one to define section blocks right away. Note that using Markup Languages ---------------- The core of pipadoc is completely agnostic about the markup used within the documentation strings. The '--markup' option only sets the 'MARKUP' variable and output generation tries include the markup in the toplevel. Usually only string substitution and postprocessors should handle markup related things. The shipped configuration file comes with postprocessors for 'asciidoc' and 'text' markups. More will will be added in future (org-mode, markdown). Operators --------- Operators define how documentation comments are evaluated, they are the core functionality of pipadoc and mandatory in the pipadoc syntax to define a pipadoc comment line. It is possible to (re-)define operators. Operators must be a single punctuation character. `:` :: The documentation operator. Defines normal documentation text. Each pipadoc comment using the `:` operator is processed as documentation. Later when generating the toplevel Section is used to paste all other documentation in proper order together. `+` :: Concat operator. Like ':' but appends text at the last line instead creating a new line. Note that only the 'TEXT' is appended all other context information gets lost. `=` :: Section paste operator. Takes a section name as argument and will paste that section in place. `{` :: Conditional block start. Needs a string substitution predicate as ARG and its arguments (without the closing curly brace). Evaluated at 'output' times. When this predicate evaluates to *true* then the following documentation is included in the output, when *false* then the following output within this block is supressed. A conditional block must end with the matching closing curly brace operator. Conditional blocks can be nested. For possible predicates see <<Predicates,below>>. `}` :: Conditional output block end. Must match a preceeding block start. `@` :: Alphabetic sorting operator. Takes a section name as argument and will paste section text alphabetically sorted by its keys. `$` :: Generic sorting operator. Takes a section name as argument and will paste section text sorted by its keys. `#` :: Numerical sorting operator. Takes a section name as argument and will paste section text numerically sorted by its keys. `!` :: Section drop operator. Deletes the section given as argument at output time. Used to clean up orphan warnings for unused sections for certain toplevels. Processors ---------- Pre- and Post- processors are lua functions that allow to manipulate text programmatically. They get the full context of the current processed line as input and can return the modified line or choose to drop or keep the line. They are can freely store some state elsewhere and thus allow some processing that spans lines which is normally not available in pipadoc. Preprocessors ~~~~~~~~~~~~~ Preprocessors are per filetypes. A preprocessor can modify any input line prior it gets parsed and further processed. Preprocessors are used to autogenerate extra documentation comments from code. Lifting parts of the code to the documentation side. They operate on the whole source line, not only the pipadoc comment. This is the place to generate data for new sections (Gloassaries, Indices). Postprocessors ~~~~~~~~~~~~~~ Postprocessors run at output generation time. They are registered per markup type. They are used to augment the generated output with markup specific things. They operate only on parsed documentation comments and only those should be modified. All other data is already available and stored, thus they may not generate new sections. String Substitution Engine -------------------------- Documentation text is be passed to the string substitution engine which recursively substitutes macros within curly braces. The substitutions are taken from the passed context (and GLOBAL's). When a docline is entirely a single string substitution (starting and ending with a curly brace) and the string substitution resulting in an empty string, then the whole line becomes dopped. If this is not intended one could add a second empty '{NIL}' FOO string substitution to the line. A string substitution can be either a string or a Lua function which shall return the substituted text. * When the susbtitution is defined as string, then the argument passed as +__ARG__+ and the resulting string will become recursively evaluated by the engine. * When it is a function, then this function is responsible for calling recursive evaluation on its arguments and results. String Substitution Language ---------------------------- The string substitution engine comes with some macros predefined which implement a simple lisp-like programming language to allow conditional evaluation and (in future) other useful features. This language is enabled when assembling the output in order and evaluated in the last step of the postprocessor. Configuration File ------------------ Pipadocs main objective is to scrape documentation comments from a project and generate output in desired order. Such an basic approach would be insufficient for many common cases. Thus pipadoc has pre and postprocessors and the string substitution engine to generate and modify documentation in an extensible way. These are defined in an user supplied configuration file. Pipadoc tries to load the configuration file on startup. By default it is named +pipadoc_config.lua+ in the current directory. This name can be changed with the '--config' option. The configuration file is used to define pre- and post- processors, define states for those, define custom operators and string substitution macros. It is loaded and executed as it own chunk and may only access the global variables and call the API functions described below. Without a configuration file none of these processors are defined any only few variables for string substitution engine are set. External Libraries ~~~~~~~~~~~~~~~~~~ 'pipadoc' does not depend on any external Lua libraries. Nevertheless modules can be loaded optionally to augment the behavior and provide extra features. Plugin-writers should use the 'request()' function instead the Lua 'require()', falling back to simpler but usable functionality when some library is not available or call 'die()' when a reasonable fallback won't do it. Pipadoc already calls 'request "luarocks.loader"' to make rocks modules available. [appendix] GNU General Public License -------------------------- ---- pipadoc - Documentation extractor Copyright (C) Pipapo Project 2015, 2016, 2017, 2020 Christian Thaeter <ct@pipapo.org> This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. ----
About
Language agnostic Documentation extractor