This repository has been archived by the owner on Sep 3, 2021. It is now read-only.

Query performance issue if filtering on paths with more than one relation ()-[r1]->()<-[r2]-(n { filter: ... } ) #574

Open
lfritsche opened this issue Jan 17, 2021 · 2 comments

Comments

@lfritsche

Hello,

I'm facing performance issues when filtering on the end node of a relationship.

Given the following path:

(u:User)-[:RATES]->(m:Movie)<-[:PLAYED_IN]-(a:Actor)

To get all Actors that match certain criteria, I use the following query:

query getFilteredActors(
  $userID: Int!
  $roles: [String!]
  $name: String
  $age: Int
) {
  User( userID: $userID ) {
    movieRatings {
      rating
      Movie {
        movieID
        title
        actors(
          filter: {
            role_in: $roles
            Actor: { AND: [{ name_contains: $name }, { age_gte: $age }] }
          }
        ) {
          role
          Actor {
            name
          }
        }
      }
    }
  }
}

Querying ~2000 end nodes (a:Actor) without a filter takes around 100 ms.
If I add the filter "Actor: {...}" to the actors relationship field, the total query time rises to as much as 30 seconds.
Right now I post-process the unfiltered data and filter the result manually on the client.
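
For reference, the manual post-processing could look roughly like the following TypeScript sketch. It is only an illustration, not the code actually used here: the interface and function names are made up, the shapes simply mirror the selection set of the query above, and it assumes that `age` has been added to the `Actor` selection (the query above only selects `name`). It also assumes `name_contains` is a case-sensitive substring match.

```typescript
// Hypothetical shapes mirroring the query's selection set.
// `age` is an assumed extra field that would have to be selected on Actor.
interface ActorNode { name: string; age: number }
interface ActorRelation { role: string; Actor: ActorNode }
interface MovieNode { movieID: string; title: string; actors: ActorRelation[] }
interface MovieRating { rating: number; Movie: MovieNode }

// Optional criteria; an undefined entry means "do not filter on this".
interface ActorCriteria { roles?: string[]; name?: string; age?: number }

// Client-side approximation of role_in, name_contains and age_gte.
function matchesCriteria(rel: ActorRelation, c: ActorCriteria): boolean {
  if (c.roles !== undefined && c.roles.indexOf(rel.role) === -1) return false;
  if (c.name !== undefined && rel.Actor.name.indexOf(c.name) === -1) return false;
  if (c.age !== undefined && rel.Actor.age < c.age) return false;
  return true;
}

// Returns a copy of the ratings in which each movie keeps only matching actors.
// The input array and its objects are left unmodified.
function filterActors(ratings: MovieRating[], c: ActorCriteria): MovieRating[] {
  return ratings.map((r) => ({
    rating: r.rating,
    Movie: {
      movieID: r.Movie.movieID,
      title: r.Movie.title,
      actors: r.Movie.actors.filter((a) => matchesCriteria(a, c)),
    },
  }));
}
```

This keeps the fast unfiltered query on the server and moves the three conditions to the client, at the cost of transferring all ~2000 end nodes.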

But am I doing something wrong here?
https://grandstack.io/docs/graphql-filtering/#nested-filter uses actors_in: ... to filter within the relationship end node. Is that approach different from my example above?

Thank you!

@lfritsche
Author

If I start directly on the Movie node and apply the same filter, the query resolves immediately.
So it seems that filtering across a second relationship is what causes the performance issue.

As a workaround, I now fetch the movies in a first query and pass the resulting IDs to a second query:

query ActorsByMovieIDs(
  $movieIDs: [ID!]
  $roles: [String!]
  $name: String
  $age: Int
) {
  Movie(filter: { movieID_in: $movieIDs }) {
    movieID
    title
    actors(
      filter: {
        role_in: $roles
        Actor: { AND: [{ name_contains: $name }, { age_gte: $age }] }
      }
    ) {
      role
      Actor {
        name
      }
    }
  }
}
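
A minimal sketch of how the first query's result could be turned into the variables for the second query is shown below. The type and function names are hypothetical, `movieID` is assumed to arrive as a string (it is passed as `ID` above), and the step that actually sends the queries is left out because it depends on the client in use.

```typescript
// Hypothetical shape of one entry of User.movieRatings from the first query.
interface RatedMovie { rating: number; Movie: { movieID: string } }

// Filter values that are forwarded unchanged to the second query.
interface ActorFilterInput { roles?: string[]; name?: string; age?: number }

// Variables expected by the ActorsByMovieIDs query above.
interface SecondQueryVariables extends ActorFilterInput { movieIDs: string[] }

// Collects the distinct movie IDs in first-seen order.
// indexOf is quadratic in the worst case, which is acceptable for a few thousand IDs.
function collectMovieIDs(ratings: RatedMovie[]): string[] {
  const ids: string[] = [];
  for (const r of ratings) {
    if (ids.indexOf(r.Movie.movieID) === -1) {
      ids.push(r.Movie.movieID);
    }
  }
  return ids;
}

// Builds the variables object for the second request.
function buildSecondQueryVariables(
  ratings: RatedMovie[],
  filter: ActorFilterInput
): SecondQueryVariables {
  return {
    movieIDs: collectMovieIDs(ratings),
    roles: filter.roles,
    name: filter.name,
    age: filter.age,
  };
}
```

Deduplicating the IDs keeps the `movieID_in` list short if the same movie shows up more than once in the first result.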

This time, the combined time of both queries is below 200 ms.
Any idea how I could speed up the nested query from my original example so that I don't need to send two requests?

Thanks!

@lfritsche lfritsche changed the title [Question] Performance issue if filtering on relationship end node property [Question] Query performance issue if fitlering on paths with more than one relation ()-[r1]->()<-[r2]-(n { filter: ... } ) Jan 17, 2021
@lfritsche lfritsche changed the title [Question] Query performance issue if fitlering on paths with more than one relation ()-[r1]->()<-[r2]-(n { filter: ... } ) Query performance issue if filtering on paths with more than one relation ()-[r1]->()<-[r2]-(n { filter: ... } ) Jan 17, 2021
@michaeldgraham
Collaborator

#608
