Skip to content

Language Spec

Jordan Matelsky edited this page Sep 25, 2020 · 2 revisions

Hi! It's super unlikely that you actually want to be on this page, unless you're debugging weird parser-level silliness in the DotMotif codebase.

The following is an Extended Backus-Naur Form definition of the DotMotif language. The definitive version of this file lives here.

v2.0 (Current)

// See Extended Backus-Naur Form for more details.
start: comment_or_block+


// Contents may be either comment or block.
comment_or_block: block

// Comments are signified by a hash followed by anything. Any line that follows
// a comment hash is thrown away.

?comment        : "#" COMMENT

// A block may consist of either an edge ("A -> B") or a "macro", which is
// essentially an alias capability.
block           : edge
                | macro
                | macro_call
                | node_constraint
                | automorphism_notation
                | comment



// A macro can be considered a "function" definition that can be used in the
// rest of the file to define complex structure. For example,
//     foo(a, b) { a -> b }; foo(A, B);
// is the same as:
//     A -> B
// While this is a simple case, this is an immensely powerful tool.
macro           : variable "(" arglist ")" "{" macro_rules "}"

// A series of arguments to a macro
?arglist        : variable ("," variable)*

macro_rules     : macro_block+

?macro_block    : edge_macro
                | macro_call_re
                | comment
                | macro_node_constraint

// A "hypothetical" edge that forms a subgraph structure.
edge_macro      : node_id relation node_id
                | node_id relation node_id "[" macro_edge_clauses "]"



// A macro is called like a function: foo(args).
macro_call      : variable "(" arglist ")"
?macro_call_re  : variable "(" arglist ")"



// Edges are currently composed of a node, a relation, and a node. In other
// words, an arbitrary word, a relation between them, and then another node.
// Edges can also have optional edge attributes, delimited from the original
// structure with square brackets.
edge            : node_id relation node_id
                | node_id relation node_id "[" edge_clauses "]"

// A Node ID is any contiguous (that is, no whitespace) word.
?node_id        : variable

// A relation is a bipartite: The first character is an indication of whether
// the relation exists or not. The following characters indicate if a relation
// has a type other than the default, positive, and negative types offered
// by default.
relation        : relation_exist relation_type

// A "-" means the relation exists; "~" means the relation does not exist.
relation_exist  : "-"                               -> rel_exist
                | "~"                               -> rel_nexist
                | "!"                               -> rel_nexist

// The valid types of relation are single-character, except for the custom
// relation type which is user-defined and lives inside square brackets.
relation_type   : ">"                               -> rel_def
                | "+"                               -> rel_pos
                | "-"                               -> rel_neg
                | "|"                               -> rel_neg

// Edge attributes are separated from the main edge declaration with sqbrackets
edge_clauses            : edge_clause ("," edge_clause)*
macro_edge_clauses      : edge_clause ("," edge_clause)*

edge_clause             : key op value


// Node constraints:
node_constraint         : node_id "." key op value_or_quoted_value
                        | node_id "." key op node_id "." key

macro_node_constraint   : node_id "." key op value_or_quoted_value
                        | node_id "." key op node_id "." key


// Automorphism notation:
automorphism_notation: node_id "===" node_id

?value_or_quoted_value: WORD | NUMBER | DOUBLE_QUOTED_STRING


?key            : WORD | variable
?value          : WORD | NUMBER
?op             : OPERATOR | iter_ops


variable       : NAME

iter_ops        : "contains"                        -> iter_op_contains
                | "in"                              -> iter_op_in

NAME            : /[a-zA-Z_-]\w*/
OPERATOR        : /(?:[\!=\>\<][=]?)|(?:\<\>)/
VAR_SEP         : /[\_\-]/
COMMENT         : /#[^\n]+/
DOUBLE_QUOTED_STRING  : /"[^"]*"/
%ignore COMMENT

%import common.WORD -> WORD
%import common.SIGNED_NUMBER  -> NUMBER
%import common.WS
%ignore WS

v1.0 (October 23, 2018)

S -> T
T -> nERn   | TDT | c  | ε
R -> >      | +   | ~  | ?
E -> -      | !   | ?
D -> ;      | \n
n -> /\S+/
c -> /\#.*/