These are the notes I've taken while going through Neo4j's introduction course. If you find them useful, please leave a star. If you have any questions, feel free to open an issue. =)

IamFlowZ/neo4j-notes

My Notes from Neo4j Graph Academy

Neo4j Reference Card

Table of Contents

  1. Cypher
  2. Querying
  3. Creating & Updating
  4. Parameters
  5. Analysis
  6. Constraints
  7. Indexes
  8. Importing

Cypher

  • Uses ASCII art to describe patterns to the database
  • () = node
  • [] = relationship
    • "-" = undirected relationship (edge)
    • -> or <- = directed relationship
  • Node labels and relationship types are both denoted by a colon followed by a name: (:LABEL) for a node, [:LABEL] for a relationship
  • Properties are surrounded by curly brackets (:LABEL {prop:val})
  • Assign variables to nodes before labels "(a:LABEL)" or before patterns "a=()--()"
  • (a:Person{name:"John"}) - [:FOLLOWS] -> (b) = a follows b
  • (a:Person) <- [:FOLLOWS] - (b) = b follows a
  • (a:Person) - [:FOLLOWS] - (b) = a and b are related in either direction (leave off the arrowhead to ignore direction)
  • Use "MATCH" to search for a pattern
  • Use "RETURN" to return patterns or properties
  • MATCH (a:Person{name:"John"}) - [:FOLLOWS] -> (b) RETURN a.name, a.height
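
The pattern syntax above can be tried end to end with a small sketch like the one below. The sample data (the "Mary" node and the height value) is made up for illustration and is not from the course.

```cypher
// Hypothetical sample data: two Person nodes joined by a FOLLOWS relationship.
CREATE (:Person {name: "John", height: 180})-[:FOLLOWS]->(:Person {name: "Mary"});

// Directed match: everyone John follows.
MATCH (a:Person {name: "John"})-[:FOLLOWS]->(b)
RETURN a.name, a.height, b.name;

// Undirected match bound to a path variable p: returns the whole path,
// whichever way the FOLLOWS relationship points.
MATCH p = (a:Person {name: "John"})-[:FOLLOWS]-(b)
RETURN p;
```

Each statement ends with a semicolon so the three can be run one after another in cypher-shell or the Neo4j Browser.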

Querying

  • Use "WHERE" to filter results
    • WHERE a.age >= 22
  • You can match multiple patterns in a single query by separating them with commas
    • MATCH (a:Person) - [:KNOWS] - (b:Person {name:"Jack"}), (a) - [:KNOWS] - (c:Person {name:"Cindy"})
  • You can query for varying path lengths with [:LABEL*N]
    • Where "N" is the number of relationships (hops) that you would like to traverse
    • You can omit "N" ([:LABEL*]) to traverse any number of "LABEL" relationships
    • You can also specify a length range with ".." (N..M)
    • [:KNOWS*1..2]
      • friends and friends of friends
    • Wrap a variable-length pattern in shortestPath() to find the shortest path
  • Many aggregation functions are available, such as count(), sum(), avg(), min(), max(), and collect()
    • You don't need to specify a grouping key: any non-aggregated expressions in the same RETURN or WITH become the grouping keys.
  • Use "WITH" to do intermediate filtering
    • WITH a, count(a) as numMutuals
    • For example, the initial match pattern may be very general, but you have some range of values in mind. Use this to filter down to those ranges.
    • Only the variables listed in the WITH clause remain accessible afterwards; anything else from the original match is dropped
  • Use "DISTINCT" to remove duplicate rows from results
    • Can be paired with "WITH"
    • WITH DISTINCT a
  • Use "ORDER BY" to sort results
    • ORDER BY a.age DESC
  • Use "LIMIT" to limit the number of results
    • LIMIT N
  • "count()" has two forms
    • "count(*)" returns the number of rows a pattern resolves to
    • "count(a)" returns the number of non-null instances of an expression
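
As a rough illustration of how these clauses combine, here is a sketch that assumes a graph of Person nodes with name and age properties joined by KNOWS relationships. The names "Jack" and "Cindy" come from the notes above; the thresholds are arbitrary.

```cypher
// Friends and friends-of-friends of Jack who are at least 22, oldest first.
MATCH (jack:Person {name: "Jack"})-[:KNOWS*1..2]-(other:Person)
WHERE other.age >= 22 AND other <> jack
RETURN DISTINCT other.name, other.age
ORDER BY other.age DESC
LIMIT 10;

// People who share at least two acquaintances with Cindy.
// "a" is not aggregated, so it becomes the grouping key for count().
MATCH (a:Person)-[:KNOWS]-(m:Person)-[:KNOWS]-(b:Person {name: "Cindy"})
WHERE a <> b
WITH a, count(DISTINCT m) AS numMutuals
WHERE numMutuals >= 2
// Only a and numMutuals are in scope here; m and b were not carried through WITH.
RETURN a.name, numMutuals
ORDER BY numMutuals DESC;

// Length of the shortest KNOWS chain between Jack and Cindy.
MATCH p = shortestPath((x:Person {name: "Jack"})-[:KNOWS*]-(y:Person {name: "Cindy"}))
RETURN length(p);
```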

Creating & Updating

  • Use "CREATE" followed by a pattern to store new information in the database
  • If creating patterns where nodes already exist, "MATCH" them first, then fill in the blanks
    • MATCH (a:Person{name:"Tom"}), (b:Person{name:"Ken"}) CREATE (a) - [:KNOWS] -> (b) RETURN a, b;
  • You can add additional labels to nodes
    • MATCH (m:Movie) SET m:Action
  • Remove labels with REMOVE
    • REMOVE m:Action
  • You can use map (JSON-like) syntax to set properties on a node
    • SET x = {prop1: val1, prop2: val2} replaces all existing properties
    • SET x += {prop3: val3} adds or updates properties and keeps the rest
  • Set individual properties
    • SET x.name="kyle"
  • "MERGE" should be used carefully
    • Can lead to duplicated data if not used correctly
    • Either matches existing data or creates it, depending on whether the whole pattern is found.
  • Use "ON CREATE SET" to set properties on nodes that the merge creates.
  • Use "ON MATCH SET" to set properties on nodes that already exist.
  • When using merge to create relationships, first match (or merge) the nodes to be related; otherwise the whole pattern, nodes included, is created.
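
A minimal sketch of the MERGE advice above, assuming Person nodes are identified by name. The createdAt and lastSeen properties, the movie title, and the Classic label are invented for the example.

```cypher
// Find or create each person, recording which of the two happened.
MERGE (a:Person {name: "Tom"})
  ON CREATE SET a.createdAt = timestamp()
  ON MATCH SET a.lastSeen = timestamp()
MERGE (b:Person {name: "Ken"})
// Both nodes are bound at this point, so this MERGE can only add the
// relationship; it cannot create duplicate Person nodes.
MERGE (a)-[:KNOWS]->(b)
RETURN a, b;

// Labels and properties can be updated together.
MATCH (m:Movie {title: "The Matrix"})
SET m:Action, m += {released: 1999}   // += keeps the other properties
REMOVE m:Classic                      // no effect if the label is absent
RETURN labels(m), m.released;
```

Merging each node on its own before merging the relationship is what keeps a partial match from creating a second copy of a node that already exists.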

Parameters

  • In production, avoid hard-coding values into queries. Keep the statement text fixed and pass the values in as parameters.
  • Use "$" followed by a name to parameterize values
    • SET a.name = $personName
  • Use the ":params" command in the Neo4j Browser to set parameters
    • :params {personName: "Scott"}
    • View the currently set parameters with :params
    • Remove a value by resubmitting the params object without that key/value
    • Clear the parameters object by passing an empty object :params {}
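
Put together, a parameterized update might look like the sketch below. The first line is a Neo4j Browser command rather than Cypher, so it is assumed to be run as its own entry before the query; the newName parameter is hypothetical.

```cypher
:params {personName: "Scott", newName: "Scotty"}

// The statement text never changes; only the parameter values do.
MATCH (a:Person {name: $personName})
SET a.name = $newName
RETURN a.name;
```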

Analysis

  • Two main commands to analyze query execution
    • "EXPLAIN" - shows the execution plan and estimated row counts without running the query
    • "PROFILE" - runs the query and reports how the engine actually behaved, including rows and DB hits per step
  • Neo4j keeps a cache of common nodes & relationships. The more specific you are with the nodes you're looking for, the more likely it is to hit the cache. Using "PROFILE" to monitor cache vs DB hits can be an easy way to improve query performance.
  • If a query is executed and returns a lot of data, there is no way to monitor or kill it while it's returning. This can have significant performance costs.
  • If a query is long-running but hasn't returned, you can monitor it with :queries
  • You can then kill it with CALL dbms.killQuery('query-id') or by clicking in the GUI
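
Both commands are used by prefixing an ordinary query. The query here is only a placeholder taken from the Cypher section.

```cypher
// Show the planned operators and estimated rows; the query is NOT executed.
EXPLAIN
MATCH (a:Person {name: "John"})-[:FOLLOWS]->(b)
RETURN b.name;

// Execute the query and report actual rows and DB hits per operator.
PROFILE
MATCH (a:Person {name: "John"})-[:FOLLOWS]->(b)
RETURN b.name;
```

Because PROFILE really runs the statement, prefixing a write query with it will modify the data; EXPLAIN is the safer first step.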

Constraints

  • Easy way to prevent duplication is through constraints
  • Mainly used to guarantee uniqueness
  • If any nodes/relationships fail to meet the constraint when the constraint is being created, the constraint will not be created.
  • Unique property for a given label
    • CREATE CONSTRAINT ON (a:Person) ASSERT a.name IS UNIQUE
  • Ensure a property exists
    • CREATE CONSTRAINT ON ()-[a:CREATED]-() ASSERT exists(a.date)
  • Remove constraints with "DROP"
    • DROP CONSTRAINT ON ()-[a:CREATED]-() ASSERT exists(a.date)
  • Use Node Keys to require that a combination of properties exists and is unique for a node label. Internally, a node key is backed by a composite index on the label.
    • CREATE CONSTRAINT ON (a:Person) ASSERT (a.name, a.born) IS NODE KEY
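
To see a uniqueness constraint in action, something like the following could be run against an empty test database. It uses the ON ... ASSERT syntax from these notes, which belongs to older Neo4j releases. The commented-out line is the FOR ... REQUIRE form used by Neo4j 5, with a made-up constraint name.

```cypher
// Older syntax, as used in these notes.
CREATE CONSTRAINT ON (a:Person) ASSERT a.name IS UNIQUE;

// The first insert succeeds.
CREATE (:Person {name: "Tom"});

// The second is expected to fail with a constraint violation,
// because a Person named "Tom" already exists.
CREATE (:Person {name: "Tom"});

// Newer form (person_name_unique is a hypothetical name):
// CREATE CONSTRAINT person_name_unique FOR (a:Person) REQUIRE a.name IS UNIQUE;
```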

Indexes

  • Indexes improve read speed at the cost of write speed and storage size, because they store redundant data.
  • Unlike SQL, there are no primary keys. A label can have multiple required, unique properties.
  • Single property indices are used for:
    • Equality checks "="
    • Range comparisons ">, <, <=, >="
    • List membership "IN"
    • String comparisons "STARTS WITH, ENDS WITH, CONTAINS"
    • Existence checks "exists()"
    • Spatial distance searches "distance()"
    • Spatial bounding searches "point()"
  • Composite indices are used for:
    • Equality checks
    • List membership
  • For large graphs, it is recommended to create indices AFTER data is loaded.
  • A uniqueness constraint also creates an index, but an index on its own does not enforce uniqueness
  • Use "CREATE INDEX" to create indices for labels
    • CREATE INDEX ON :Movie(released)
  • Create composite indices by listing properties in the query
    • CREATE INDEX ON :Movie(released, format)
    • This creates a second combined index. This and the previous index would both exist.
  • List indices with CALL db.indexes()
  • Delete indices with "DROP INDEX"
    • DROP INDEX ON :Movie(released, format)
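
One way to check that an index is actually being used is to profile a query that filters on the indexed property. This sketch uses the older CREATE INDEX ON syntax from these notes; the title property, the year, and the index name in the commented line are assumptions.

```cypher
// Older syntax, as used in these notes.
CREATE INDEX ON :Movie(released);

// Once the index is online, the plan should show an index seek
// on :Movie(released) instead of a scan over every Movie node.
PROFILE
MATCH (m:Movie)
WHERE m.released >= 2000
RETURN m.title;

// Newer form (movie_released is a hypothetical name):
// CREATE INDEX movie_released FOR (m:Movie) ON (m.released);
```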

Importing

  • The most common way to import data into Neo4j is via .csv files.
  • Use "LOAD CSV" to pull normalized csv data into Neo4j
    • parses a file in the import directory of the Neo4j installation
    • LOAD CSV WITH HEADERS FROM url-value AS row
    • You can specify the csv delimiter with "FIELDTERMINATOR"
      • FIELDTERMINATOR ';'
  • If you're importing more than 10k lines, you should use "USING PERIODIC COMMIT" to avoid mishaps or performance hits.
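
A possible import statement combining the options above. The file name people.csv and its name and born columns are hypothetical, and USING PERIODIC COMMIT is the older syntax these notes describe (Neo4j 5 replaced it with CALL { ... } IN TRANSACTIONS).

```cypher
// Commit every 1000 rows instead of holding the whole import in one transaction.
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:///people.csv" AS row FIELDTERMINATOR ';'
// MERGE rather than CREATE, so re-running the import does not duplicate people.
MERGE (p:Person {name: row.name})
// CSV fields arrive as strings; convert before storing numbers.
SET p.born = toInteger(row.born);
```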
