Specifying Cypher constraints #166

Mats-SX · 2016-12-15T09:56:00Z

Specifies syntax for constraints and specifies three concrete constraints: node property uniqueness, node property existence, and relationship property existence.

CIP

petraselmer

Looks good! Some comments...

thobe · 2017-02-28T15:54:51Z

I thought the idea was that we were going to specify a syntax for constraints without specifying which particular constraints an implementation should support. The syntax definition here explicitly talks only about uniqueness-constraint and existence-constraint.

Mats-SX · 2017-03-01T13:28:08Z

CIP has now been reworked to try and fit the model discussed.

boggle · 2017-03-01T19:31:56Z

cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc

+* `UNIQUE` - Ensures that each row for a column must have a unique value.
+* `PRIMARY KEY` - A combination of a `NOT NULL` and `UNIQUE`. Ensures that a column (or a combination of two or more columns) has a unique identity, reducing the resources required to locate a specific record in a table.
+* `FOREIGN KEY` - Ensures the referential integrity of the data in one table matches values in another table.
+* `CHECK` - Ensures that the value in a column meets a specific condition


Maybe we should have a leading keyword like that for non-UNIQUE constraints as well?

If so, I would propose the word THAT, since it reads nicely:

CREATE CONSTRAINT FOR (x:Foo) REQUIRE THAT ...

thobe · 2017-03-07T08:52:56Z

PRIMARY KEY is not a helpful name for the concept it is used for describing.

On the high level there are two reasons for this:

- The word "primary" does not mean anything that is helpful in this context.
- The concept of primary keys carries with it a lot of associations from relational databases, many of which do not apply to the property graph model.
  - It should be noted that for the relational database case the word "primary" does have relevant meaning.

Diving further into these reasons, starting with the word "primary":

- It implies that there is such a thing as a "secondary" key as well. In a relational database a secondary key is any index on a table other than the primary key.
- It implies that this key has higher importance than any other key. While this might be true in many actual domain models, it is not always the case: in some cases there are other keys of equal importance.
- It carries with it the association that there can be only one primary key. In many implementations, Neo4j being one of them, there is absolutely no need to enforce such a constraint on the ability to model your data.
- It implies that the key needs to be defined first, before any data is inserted. In many implementations, Neo4j being one of them, this is not the case.

As for the aspects of primary keys in relational databases that do not apply to the property graph model:

- The notion of there being a primary key implies that there might also be a foreign key. The idea of having foreign keys in a graph is quite silly, since we have direct relationships.
- Coming from relational databases, I would expect preferential treatment for primary keys over any other (secondary) keys. I would expect lookup based on the primary key to be faster than lookup by any other key, since I would expect the data in that table to be structured by the primary key. In essence, I would expect the leaf nodes in the index for the primary key to be the actual rows, with all of their data, whereas for a secondary key the leaf node would only point to the actual row in the primary structure. Accessing the full data by a secondary key would thus involve an indirection, penalizing access by secondary key.
  - Again, this would not be true in, for example, Neo4j, where every key is actually secondary.
- Relational databases equate the primary key with identity. The property graph model (at least in some implementations, for example Neo4j) has a separate notion of identity. While the type of key we are proposing to add to the model would allow you to uniquely identify an entity, it would not necessarily identify that same entity forever: the same entity might change some of the values of the key and thus be identified by a different key, but still have the same identity.

The only reason I can think of for introducing the concept of a primary key in Cypher is for being able to map Cypher onto a relational database model. If that is the case I would much rather see this proposed from a vendor working on such a mapping, since they would have the insight into what needs to be modeled.

I do think that the notion of a unique indexed key of mandatory properties is helpful, and I see the benefit of elevating such a concept to the status of receiving its own syntax, but I don't think PRIMARY KEY is a good name for it.
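
The point above about keys versus identity can be illustrated with a small Python sketch. Everything in it is hypothetical: `Graph`, `add_node`, `set_property`, `find_by_key`, and the integer ids are illustrative stand-ins for an implementation's internal notion of identity, not Neo4j or openCypher API.

```python
class Graph:
    """Toy store: each node has an internal id that is independent of any key."""

    def __init__(self, key_props):
        self.key_props = tuple(key_props)  # properties forming the unique key
        self.nodes = {}                    # internal id -> property dict
        self.key_index = {}                # key tuple -> internal id
        self._next_id = 0

    def _key_of(self, props):
        return tuple(props[p] for p in self.key_props)

    def add_node(self, **props):
        key = self._key_of(props)
        if key in self.key_index:
            raise ValueError(f"duplicate key {key}")
        node_id = self._next_id
        self._next_id += 1
        self.nodes[node_id] = dict(props)
        self.key_index[key] = node_id
        return node_id

    def set_property(self, node_id, name, value):
        props = self.nodes[node_id]
        old_key = self._key_of(props)
        new_props = {**props, name: value}
        new_key = self._key_of(new_props)
        if new_key != old_key and new_key in self.key_index:
            raise ValueError(f"duplicate key {new_key}")
        # Re-index under the new key; the internal id is unchanged.
        del self.key_index[old_key]
        self.key_index[new_key] = node_id
        self.nodes[node_id] = new_props

    def find_by_key(self, *key):
        return self.key_index.get(tuple(key))


g = Graph(key_props=("name", "address"))
alice = g.add_node(name="Alice", address="1 Main St")
g.set_property(alice, "address", "2 Side St")
```

After the update, the old key no longer finds the node and the new key does, while the internal id held in `alice` is the same as before.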

thobe · 2017-03-07T09:03:36Z

cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc

+==== Mutability
+
+Once a constraint has been created, it may not be amended.
+Should a user wish to change its definition, it has to be dropped and recreated with an updated structure.


Should we have a note here that transactional implementations could do both the dropping and recreation in the same transaction so that the constraint is atomically mutated? This would of course allow leaving the old constraint in place should the creation of the new constraint fail.
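
A minimal Python sketch of the atomic replacement suggested here, assuming a toy in-memory catalog. `Catalog`, `replace`, and `ConstraintViolation` are hypothetical names, and a real transactional implementation would rely on its own transaction machinery rather than a snapshot.

```python
class ConstraintViolation(Exception):
    pass


class Catalog:
    """Toy schema catalog: constraint name -> predicate over a node's properties."""

    def __init__(self, nodes):
        self.nodes = nodes      # list of property dicts
        self.constraints = {}   # name -> predicate

    def create(self, name, predicate):
        # Creating a constraint validates all existing data first.
        for props in self.nodes:
            if not predicate(props):
                raise ConstraintViolation(f"{name} violated by {props}")
        self.constraints[name] = predicate

    def drop(self, name):
        del self.constraints[name]

    def replace(self, name, new_predicate):
        """Drop and recreate as one unit; restore the old state on failure."""
        snapshot = dict(self.constraints)
        try:
            self.drop(name)
            self.create(name, new_predicate)
        except Exception:
            self.constraints = snapshot  # roll back: the old constraint stays
            raise


catalog = Catalog([{"x": 1}, {"x": 2, "y": 3}])
catalog.create("has_x", lambda p: p.get("x") is not None)
old = catalog.constraints["has_x"]
try:
    # The first node has no y, so the new definition cannot be created.
    catalog.replace("has_x", lambda p: p.get("y") is not None)
except ConstraintViolation:
    pass
```

Because the failed recreation restores the snapshot, `catalog.constraints["has_x"]` is still the original predicate afterwards.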

thobe · 2017-03-07T09:14:35Z

I wonder if the notion of a unique key (which at the moment is erroneously called a primary key) should really be a constraint, or if it should have its own syntax, something like:

CREATE KEY FOR (p:Person) AS p.name, p.address

The reason for this being that it actually implies multiple constraints, and typically also an index. Since it is a composed concept like that, perhaps it would be sensible to elevate it to being a syntactical concept of its own.

In the syntax for this, if accepted, we should allow an optional name for the key as well, just like we do for constraints.

Mats-SX · 2017-03-07T15:07:34Z

The CIP now uses ADD, NODE KEY and details a return record. I also took several review comments into account (thanks!).

Mats-SX · 2021-07-02T12:48:46Z

This CIP has now been updated. See commit history for a summary of new models.

TCK tests cannot be added as the CIP mandates no particular concrete constraint predicates to be enforced by every implementation of Cypher. However, I will exemplify TCK scenarios that one could consider if one were to implement the NODE KEY constraint:

Feature: CreateConstraint1

  Scenario: [1] Blocking creation of nodes that do not conform to NODE KEY constraint
    Given an empty graph
    And having executed:
      """
      CREATE CONSTRAINT
      FOR (a:A)
      REQUIRE (a.x, a.y) IS NODE KEY
      """
    When executing query:
      """
      CREATE (a:A)
      SET a.x = 1
      """
    Then ConstraintValidationFailed should be raised at runtime: NodeKeyRequired

  Scenario: [2] Allowing creation of nodes that uphold NODE KEY constraint
    Given an empty graph
    And having executed:
      """
      CREATE CONSTRAINT
      FOR (a:A)
      REQUIRE (a.x, a.y) IS NODE KEY
      """
    When executing query:
      """
      CREATE (a:A)
      SET a.x = 1
      SET a.y = 2
      """
    Then the result should be empty
    And the side effects should be:
      | +nodes      | 1 |
      | +properties | 2 |
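
The two scenarios can be mirrored in a small Python sketch. It assumes, as the commit history of this PR suggests, that a node key is equivalent to IS NOT NULL plus IS UNIQUE over the grouped key properties. `check_node_key` and the detail string `NodeKeyNotUnique` are hypothetical; only `ConstraintValidationFailed` and `NodeKeyRequired` come from the scenarios above. The check is applied to the node's final state, after all `SET` clauses of the statement.

```python
class ConstraintValidationFailed(Exception):
    """Named after the error class in the scenarios above."""


def check_node_key(existing, candidate, key_props):
    """Validate `candidate` (a property dict) against a node key on `key_props`.

    `existing` holds the property dicts of nodes that already carry the label.
    """
    key = tuple(candidate.get(p) for p in key_props)
    # IS NOT NULL part: every key property must be present.
    if any(v is None for v in key):
        raise ConstraintValidationFailed("NodeKeyRequired")
    # IS UNIQUE part: no other node may hold the same combination.
    for props in existing:
        if tuple(props.get(p) for p in key_props) == key:
            raise ConstraintValidationFailed("NodeKeyNotUnique")  # hypothetical detail name
    return key


nodes = []

# Scenario [1]: only a.x is set, so the node is rejected.
try:
    check_node_key(nodes, {"x": 1}, ("x", "y"))
    outcome_1 = "accepted"
except ConstraintValidationFailed as e:
    outcome_1 = str(e)

# Scenario [2]: both key properties are set, so the node is accepted.
check_node_key(nodes, {"x": 1, "y": 2}, ("x", "y"))
nodes.append({"x": 1, "y": 2})
```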

Mats-SX added the CIP and language feature labels Dec 15, 2016

Mats-SX force-pushed the constraints-cip branch from 6a73206 to 73f9909 on January 11, 2017 13:40

Mats-SX force-pushed the constraints-cip branch from 73f9909 to 1fd0cbf on January 20, 2017 09:36

thobe mentioned this pull request Jan 23, 2017

Support "acyclic" constraint on relationships #172

Open

IanRogers-LShift mentioned this pull request Jan 23, 2017

Support "unique" constraint on relationships #173

Open

Mats-SX force-pushed the constraints-cip branch from 1fd0cbf to 942570a on January 30, 2017 15:26

petraselmer suggested changes Feb 1, 2017

Mats-SX removed the language feature label Feb 3, 2017

petraselmer approved these changes Feb 15, 2017

boggle reviewed Mar 1, 2017

Mats-SX mentioned this pull request Mar 3, 2017

Add Neo4j index extension CIP #197

Closed

IanRogers reviewed Mar 4, 2017

Mats-SX force-pushed the constraints-cip branch 3 times, most recently from 4121971 to 22a7f30 on March 6, 2017 08:21

thobe reviewed Mar 7, 2017

thobe reviewed Mar 7, 2017

thobe reviewed Mar 7, 2017

thobe reviewed Mar 7, 2017

thobe approved these changes Mar 30, 2017

Mats-SX force-pushed the constraints-cip branch from 32d1322 to 16c2843 on May 4, 2017 12:11

Mats-SX force-pushed the constraints-cip branch 2 times, most recently from 60c584f to 4ef7b32 on July 26, 2019 14:24

Mats-SX and others added 14 commits July 1, 2021 12:12

- 35bffe2 Introduce PRIMARY KEY constraint predicate
  - Rename `constrait-expr` to `constraint-predicate`
  - Limit scope of `UNIQUE` to single properties only
  - Update examples to reflect `PRIMARY KEY`
- 29a278d Rename constraint operator to NODE KEY
  - Remove erroneous example for composing `NODE KEY` with `UNIQUE` and `exists()`
  - Rephrase example section to describe `NODE KEY` more accurately.
- 2b76cf1 Use ADD for constraint creation
  - Add missing case for when an error should be raised
- 6849976 Add specification for the return record
- aabf685 Add tests verifying NODE KEY works in grammar
- 16fc677 Reformatted title
- 5f9acf5 Use CREATE instead of ADD
- 97ac3c9 Make textual clarifications
- e6ed668 Update grammar to use CREATE
  - Add test for DROP
- 4fe30ee Add missing semicolons in BNF syntax
- bc1c1bd Specify limitations on constraint expressions
- ba0491d Add ability to specify multiple constraints in one statement
- 54c5db7 Add example for plural syntax
- 0cca4f6 Explain usage of term 'constraint expression'

Mats-SX force-pushed the constraints-cip branch from 85f2a88 to 0cca4f6 on July 1, 2021 10:13

Mats-SX added 11 commits July 1, 2021 12:15

- 6f9e714 Use two blank lines before new section
  - For improved source readability
- 3e1acda Remove notion about creating multiple constraints
- 1f1be12 Update grammar definition
  - use same EBNF style as @hvub did in opencypher#493
  - use links to connect to grammar constructs in openCypher grammar spec
  - modify constraint operators to be suffix-modeled, ala IS NOT NULL
  - introduce 'grouped-expression' concept
- 73fa12d Prefer element over entity
- 7acef69 Update uniqueness to use IS UNIQUE operator
  - Extend definition to reference grouped expression
- aa4a549 Update node key to use IS NODE KEY operator
  - Update definition to reference grouped expression
  - Reformulate equivalence example to use IS NOT NULL and IS UNIQUE over a grouped expression
- 056f6e5 Define a grouped expression and its type
  - Move ahead of referencing sections to ease readability of CIP
- 09eb4b1 Prefer IS NOT NULL over exists()
- 2843a65 Stylistic improvements of examples
- 03bca92 Add analysis of discarded alternatives
- 26f22bb Update openCypher grammar description
  - Including links to these from the CIP
  - Also update examples for the parser tests

Mats-SX force-pushed the constraints-cip branch from 4654832 to 26f22bb on July 2, 2021 12:39

Specify NODE and KEY as symbolic names, not reserved words

334fe95

Hunterness mentioned this pull request Aug 31, 2021

Update constraint syntax neo4j/neo4j-documentation#1238

Merged
