CoffeeTableEspresso edited this page Dec 4, 2018 · 1 revision

This page will explain the basics of the DAGQL language. The EBNF included with the project shows all the valid forms a query can take. On this page, we will walk through a simple query and explain each part.

The query we will reference is

SELECT a.id, b.id FROM (
    SEARCH EDGES AS e(x, y) WHERE x.id % 2 = 0 AND y.id % 2 = 0 LIMIT 10
) AS edge(a, b) TRAVERSE BY BREADTH

At the top level, we always have SELECT, followed by a comma-separated list of expressions (what we will display to the user), in this case a.id and b.id.

Next, we have the FROM clause, which is mandatory and specifies the sub-query. The sub-query can take three forms: EDGES, which iterates over all the edges in the graph; NODES, which iterates over all the nodes; or a SEARCH clause, which takes a sub-query of its own and can filter the results from it. In this case the SEARCH clause is SEARCH EDGES AS e(x, y) WHERE x.id % 2 = 0 AND y.id % 2 = 0 LIMIT 10. Note that the SEARCH clause must be surrounded by parentheses. We will return to the SEARCH clause later.

The next part of the main query is the AS clause, which binds the nodes and edges we are iterating over to names (so we can display results to the user based on the graph we are iterating over). In this case, the AS clause is AS edge(a, b). This means we bind the edge we are iterating over to edge, the start node to a, and the end node to b. There are two forms of the AS clause: the form seen above, for when we are iterating over edges, and a second form (AS node) for when we iterate over nodes. Note that the form must match the sub-query: if the sub-query iterates over edges, we cannot use the node version of the AS clause, and if the sub-query iterates over nodes, we cannot use the edge version. The AS clause is optional, but very useful.
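The binding performed by AS edge(a, b) can be pictured with a small Python model. This is only an illustrative sketch, not the DAGQL implementation: the Node and Edge classes, their start and end fields, and the select_edges helper are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    id: int


@dataclass(frozen=True)
class Edge:
    start: Node  # hypothetical field name for the edge's start node
    end: Node    # hypothetical field name for the edge's end node


def select_edges(edges, select):
    """Model of SELECT ... FROM <edges> AS edge(a, b)."""
    rows = []
    for edge in edges:
        # AS edge(a, b): bind the edge itself, its start node, and its end node.
        a, b = edge.start, edge.end
        rows.append(select(edge, a, b))
    return rows


edges = [Edge(Node(1), Node(2)), Edge(Node(2), Node(4))]
# Corresponds to the select list "a.id, b.id" in the example query.
rows = select_edges(edges, lambda edge, a, b: (a.id, b.id))
```

Each row holds the values of the SELECT expressions evaluated with the names bound for one edge.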

The final piece of the top-level query is the TRAVERSE BY clause. This specifies the order in which to traverse the graph. There are currently two options, DEPTH and BREADTH. In our example, this clause is TRAVERSE BY BREADTH. If the TRAVERSE BY clause is omitted, the graph is traversed by depth; that is, omitting the TRAVERSE BY clause is the same as specifying TRAVERSE BY DEPTH.
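To illustrate the difference between the two orders, here is a minimal Python sketch of depth-first and breadth-first traversal over a small graph. It is a generic textbook traversal, not DAGQL's actual code: the adjacency-dict representation and the traverse function are assumptions made for the example, and the exact order in which DAGQL visits siblings may differ.

```python
from collections import deque


def traverse(graph, root, order="DEPTH"):
    """Return node ids in visit order; DEPTH is the default, as in DAGQL."""
    breadth = order == "BREADTH"
    visited = []
    seen = set()
    frontier = deque([root])
    while frontier:
        # A queue (pop from the left) gives breadth-first order,
        # a stack (pop from the right) gives depth-first order.
        node = frontier.popleft() if breadth else frontier.pop()
        if node in seen:
            continue
        seen.add(node)
        visited.append(node)
        children = graph.get(node, [])
        # Push children reversed on the stack so the first child is visited first.
        for child in (children if breadth else reversed(children)):
            if child not in seen:
                frontier.append(child)
    return visited


# Hypothetical graph: node id -> ids of the nodes it points to.
graph = {1: [2, 3], 2: [4], 3: [5], 4: [], 5: []}
depth_order = traverse(graph, 1)               # 1, 2, 4, 3, 5
breadth_order = traverse(graph, 1, "BREADTH")  # 1, 2, 3, 4, 5
```

Depth-first order follows one path to its end before backtracking, while breadth-first order visits all nodes at one distance from the root before moving further out.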

We'll now return to the sub-query. The EDGES and NODES forms of the sub-query have been discussed above. The last form is the SEARCH form. In our case, the sub-query is SEARCH EDGES AS e(x, y) WHERE x.id % 2 = 0 AND y.id % 2 = 0 LIMIT 10. A SEARCH clause takes a sub-query (in our case EDGES, but it could be any sub-query, including another SEARCH clause), an optional AS ... WHERE clause (in our case AS e(x, y) WHERE x.id % 2 = 0 AND y.id % 2 = 0), and a LIMIT clause (in our case LIMIT 10).

The AS ... WHERE clause binds names in the same way as the top-level AS clause described above, then filters the results of the sub-query based on the expression after WHERE (in this case the expression is x.id % 2 = 0 AND y.id % 2 = 0).

The LIMIT clause specifies the maximum number of results to return. It is evaluated once, and may not use any name bindings (although it can evaluate an arbitrary expression without name bindings, if desired). In our case the LIMIT clause is LIMIT 10, meaning return a maximum of 10 results.
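The behaviour of a SEARCH clause can be sketched in Python as a filter followed by a cut-off. This is a rough model under stated assumptions, not the real implementation: the search function is hypothetical, edges are simplified to (start id, end id) pairs, and it is assumed that LIMIT counts results after the WHERE filter has been applied.

```python
from itertools import islice


def search(subquery, where=None, limit=None):
    """Model of SEARCH <subquery> AS e(x, y) WHERE <expr> LIMIT <n>."""
    # AS e(x, y) WHERE ...: bind each pair to (x, y) and keep only the matches.
    matches = (pair for pair in subquery if where is None or where(*pair))
    # LIMIT n: stop after at most n matching results.
    return list(islice(matches, limit))


# Hypothetical edge list standing in for EDGES: (start id, end id) pairs.
edges = [(i, i + 2) for i in range(40)]

# SEARCH EDGES AS e(x, y) WHERE x.id % 2 = 0 AND y.id % 2 = 0 LIMIT 10
result = search(edges, where=lambda x, y: x % 2 == 0 and y % 2 == 0, limit=10)

# A SEARCH can take another SEARCH as its sub-query.
nested = search(result, where=lambda x, y: x % 4 == 0, limit=3)
```

Because the result of one search is just another sequence of edges, it can be fed straight into a second search, mirroring how a SEARCH clause may contain another SEARCH clause as its sub-query.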
