Draft RPQ CIP

opencypher · Feb 6, 2017 · 6a864b0
1 parent 3ea7838
commit 6a864b0
Showing 1 changed file with 160 additions and 0 deletions.
diff --git a/cip/1.accepted/CIP2017-02-06-Regular-Path-Patterns.adoc b/cip/1.accepted/CIP2017-02-06-Regular-Path-Patterns.adoc
@@ -0,0 +1,160 @@
+= CIP2017-02-06 Regular Path Patterns
+:numbered:
+:toc:
+:toc-placement: macro
+:source-highlighter: codemirror
+
+*Authors:* Tobias Lindaaker <tobias.lindaaker@neotechnology.com>
+
+toc::[]
+
+== Regular Path Patterns
+
+Beyond the patterns that can be expressed using the normal path syntax, Cypher also supports what amounts to regular expressions over paths.
+This functionality is called Regular Path Patterns.
+
+A Regular Path Pattern is defined as:
+
+* A simple relationship type, or
+* A Regular Path Pattern followed by another Regular Path Pattern, or
+* An alternative between two Regular Path Patterns, or
+* A repetition of a Regular Path Pattern, or
+* A reference to a Defined Path Predicate.
+
+Regular Path Patterns are written similarly to how relationship patterns are written, but enclosed within two slash (`/`) characters instead of brackets (`[]`).
+
+Unlike Relationship Patterns, Regular Path Patterns do _not_ allow binding a relationship to a variable.
+To bind the matching path to a variable, use a Path Assignment: precede the path with an identifier and an equals sign (`=`).
+This avoids a problem that existed in the past with repetition of relationships (a syntax that was deprecated with the introduction of Regular Path Patterns), where a relationship variable would bind to a list, making it hard to express predicates over the actual relationships.
+Predicates on parts of a Regular Path Pattern are instead expressed through the use of explicitly defined path predicates.
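+
+As a minimal sketch of a Path Assignment over a Regular Path Pattern (the variable names `p`, `a`, and `b` and the `KNOWS` relationship type are chosen here purely for illustration), the whole matched path is bound rather than any individual relationship:
+
+[source, cypher]
+.Binding the matched path to a variable (illustrative)
+----
+MATCH p = (a)-/:KNOWS+/->(b)
+RETURN p
+----
+
+Here `p` refers to the entire path of one or more `KNOWS` relationships, so no list-valued relationship variable is involved.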
+
+=== Syntax
+
+The syntax of Regular Path Patterns fits into the greater Cypher syntax through `PatternElementChain`.
+
+----
+PatternElementChain = (RelationshipPattern | RegularPathPattern), NodePattern ;
+
+RegularPathPattern = (LeftArrowHead, Dash, '/', [RegularPathExpression], '/', Dash, RightArrowHead)
+                   | (LeftArrowHead, Dash, '/', [RegularPathExpression], '/', Dash)
+                   | (Dash, '/', [RegularPathExpression], '/', Dash, RightArrowHead)
+                   | (Dash, '/', [RegularPathExpression], '/', Dash)
+                   ;
+RegularPathExpression = {RegPathOr}- ;
+RegPathOr = RegPathSeq, {'|', RegPathSeq} ;
+RegPathSeq = {RegPathStar}- ;
+RegPathStar = RegPathDirected, [('*', [RangeLiteral]) | '+'] ;
+RegPathDirected = ['<'], RegPathBase, ['>'] ;
+RegPathBase = RegPathRelationship
+            | RegPathReference
+            | '(', RegularPathExpression, ')'
+            ;
+RegPathRelationship = RelType ;
+RegPathReference = SymbolicName ;
+----
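+
+To illustrate how these productions nest, the following sketch combines grouping, alternation, and repetition.
+The relationship type `WORKS_WITH` is hypothetical, and reading `*` as "zero or more" is an assumption made by analogy with regular expressions and with the `+` operator; the grammar itself does not state it.
+
+[source, cypher]
+.Alternation inside a repeated group (illustrative)
+----
+MATCH (a)-/(:KNOWS | :WORKS_WITH)*/->(b)
+----
+
+The parenthesized group is a `RegPathBase`, the `|` inside it forms a `RegPathOr` of two single-element `RegPathSeq` sequences, and the trailing `*` turns the group into a `RegPathStar`.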
+
+The `RegPathReference` is a reference to a Defined Path Predicate.
+These are defined using the following syntax:
+
+----
+DefinedPathPredicate = PathPredicatePrototype, 'IS', Pattern, [Where] ;
+PathPredicatePrototype = '(', Variable, ')', RegPathPrototype, '(', Variable, ')' ;
+RegPathPrototype = (LeftArrowHead, Dash, '/', DefinedPathName, '/', Dash)
+                 | (Dash, '/', DefinedPathName, '/', Dash, RightArrowHead)
+                 | (Dash, '/', DefinedPathName, '/', Dash)
+                 ;
+DefinedPathName = SymbolicName ;
+----
+
+=== Examples
+
+The astute reader of the syntax will have noticed that it is possible to express a Regular Path Pattern with an empty path expression:
+
+[source, cypher]
+----
+MATCH (a)-//-(b)
+----
+
+This pattern simply states that `a` and `b` must be the same node, and is thus the same as:
+
+[source, cypher]
+----
+MATCH (a), (b) WHERE a = b
+----
+
+The same reader will also have noticed that it is possible to define a pattern containing just a relationship type:
+
+[source, cypher]
+----
+MATCH (a)-/:KNOWS/->(b)
+----
+
+That pattern is indeed equivalent to the very similar relationship pattern:
+
+[source, cypher]
+----
+MATCH (a)-[:KNOWS]->(b)
+----
+
+The main difference is that the variant with a relationship pattern is able to bind that relationship and express further predicates over it.
+
+The Regular Path Patterns start becoming interesting when larger expressions are put together:
+
+[source, cypher]
+.Finding someone loved by someone hated by someone you know, transitively
+----
+MATCH (you)-/(:KNOWS :HATES)+ :LOVES/->(someone)
+----
+
+Note the `+` expressing one or more occurrences of the sequence `KNOWS` followed by `HATES`.
+
+The direction of each relationship is governed by the overall direction of the Regular Path Pattern.
+It is however possible to explicitly define the direction for a particular part of the pattern.
+This is done by either prefixing that part with `<` for a right-to-left direction or suffixing it with `>` for a left-to-right direction.
+It is also possible to both prefix the part with `<` and suffix it with `>`, which gives that part the interpretation of being undirected.
+
+[source, cypher]
+.Specifying the direction for different parts of the pattern
+----
+MATCH (you)-/(:KNOWS <:HATES)+ :LOVES/->(someone)
+----
+
+In the example above we say that the `HATES` relationships should have the opposite direction to the other relationships in the path.
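+
+As a sketch of the undirected case, which follows from the grammar rule `RegPathDirected` but is otherwise an illustration only, both markers can be applied to the same part:
+
+[source, cypher]
+.Marking one part of the pattern as undirected (illustrative)
+----
+MATCH (you)-/(:KNOWS <:HATES>)+ :LOVES/->(someone)
+----
+
+Under that reading, the `HATES` relationships may be traversed in either direction, while `KNOWS` and `LOVES` still follow the overall left-to-right direction of the pattern.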
+
+Through the use of Defined Path Predicates we can express even more predicates over a path:
+
+[source, cypher]
+.Find a chain of unreciprocated lovers
+----
+MATCH (you)-/unreciprocated_love*/->(someone)
+PATH (a)-/unreciprocated_love/->(b) IS
+     (a)-[:LOVES]->(b)
+     WHERE NOT EXISTS { (b)-[:LOVES]->(a) }
+----
+
+Note that no colon is used when referencing the Defined Path Predicate; in Regular Path Patterns the colon is used only for referencing actual relationship types.
+
+Sometimes it will be interesting to express a predicate on a node in a Regular Path Pattern.
+This can be achieved by using a Defined Path Predicate where the nodes on both ends are the same:
+
+[source, cypher]
+.Find friends of friends that are not haters
+----
+MATCH (you)-/:KNOWS not_a_hater :KNOWS/-(friendly_friend_of_friend)
+PATH (x)-/not_a_hater/-(x) IS (x)
+     WHERE NOT EXISTS { (x)-[:HATES]->() }
+----
+
+In the case of a Defined Path Predicate where both nodes are the same, the direction of the predicate is irrelevant.
+In general, the direction of a Defined Path Predicate is important: it is used for mapping the pattern in the predicate onto the Regular Path Patterns that reference it.
+The only case where it is allowed to omit the direction of a Defined Path Predicate is when the defined predicate is symmetric.
+This is obviously the case when both nodes are the same, but it would also be the case when the internal pattern is symmetrical, such as in the following example:
+
+[source, cypher]
+.Find chains of co-authorship
+----
+MATCH (you)-/co_author*/-(someone)
+PATH (a)-/co_author/-(b) IS
+     (a)-[:AUTHORED]->(:Book)<-[:AUTHORED]-(b)
+     WHERE a <> b
+----