Querying an interface produces very slow query with a lot of UNION #5061

Open
Masadow opened this issue Apr 30, 2024 · 3 comments

Masadow commented Apr 30, 2024

Describe the bug
When querying an interface, @neo4j/graphql generates one UNION branch for every subtype of that interface, which results in very slow queries even when no type conditions are requested.

Type definitions

interface Parent {
  id: ID
  label: String
}

type Child1 implements Parent {
  id: ID
  label: String
}

type Child2 implements Parent {
  id: ID
  label: String
}

...

type Child10 implements Parent {
  id: ID
  label: String
}

To Reproduce

Populate some nodes and relationships, then execute the following GraphQL query:

query SlowQuery {
   parents(where: {id: "113"}) {
      id
      label
   }
}

The generated query resembles:

CALL {
  MATCH (this0:Child1 {id: $param0})
  WITH this0 { .id, __resolveType: "Child1", __id: id(this0) } AS this0
  RETURN this0 AS this
  UNION
  MATCH (this1:Child2 {id: $param1})
  WITH this1 { .id, __resolveType: "Child2", __id: id(this1) } AS this1
  RETURN this1 AS this
  UNION
  ...
  MATCH (this9:Child10 {id: $param9})
  WITH this9 { .id, __resolveType: "Child10", __id: id(this9) } AS this9
  RETURN this9 AS this
}
WITH this
RETURN this as this

The execution plan for the generated query against my real-world schema (the structure is identical, only the names have changed) looks like this:

[image: execution plan]

Expected behavior

This should be instantaneous, since we're requesting a single node with a filter on a unique id. However, it can take up to several seconds to execute, because each UNION has to deduplicate the rows accumulated so far.
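To experiment with how the query grows with the number of implementing types, the shape shown above can be rebuilt with a small helper. This is only a sketch: `buildInterfaceUnionQuery` is a hypothetical name, not part of @neo4j/graphql, and it merely mirrors the Cypher pasted in this issue. The `useUnionAll` flag is an assumption added here so the `UNION` and `UNION ALL` variants can be profiled side by side by hand.

```typescript
// Hypothetical helper: rebuilds the shape of the Cypher shown in this issue.
// It is NOT the @neo4j/graphql translation code.
function buildInterfaceUnionQuery(typeNames: string[], useUnionAll = false): string {
    const separator = useUnionAll ? "UNION ALL" : "UNION";

    // One MATCH/WITH/RETURN branch per implementing type.
    const branches = typeNames.map((typeName, i) =>
        [
            `  MATCH (this${i}:${typeName} {id: $param${i}})`,
            `  WITH this${i} { .id, __resolveType: "${typeName}", __id: id(this${i}) } AS this${i}`,
            `  RETURN this${i} AS this`,
        ].join("\n")
    );

    return [
        "CALL {",
        branches.join(`\n  ${separator}\n`),
        "}",
        "WITH this",
        "RETURN this AS this",
    ].join("\n");
}

// Ten implementing types, as in the schema above.
const typeNames = Array.from({ length: 10 }, (_, i) => `Child${i + 1}`);
const unionQuery = buildInterfaceUnionQuery(typeNames);
const unionAllQuery = buildInterfaceUnionQuery(typeNames, true);
```

Running both strings with `PROFILE` in front would show whether deduplication is where the time goes. Note that `UNION ALL` can return the same node more than once if a node carries several of the labels, so it is not a drop-in replacement.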

System (please complete the following information):

  • OS: reproduced on several operating systems
  • Version: @neo4j/graphql@5.2.3
  • Node.js version: 20.x.x

Masadow added the "bug report" label Apr 30, 2024
neo4j-team-graphql added this to Bug reports in Bug Triage Apr 30, 2024

@neo4j-team-graphql
Collaborator

Many thanks for raising this bug report @Masadow. 🐛 We will now attempt to reproduce the bug based on the steps you have provided.

Please ensure that you've provided the necessary information for a minimal reproduction, including but not limited to:

  • Type definitions
  • Resolvers
  • Query and/or Mutation (or multiple) needed to reproduce

If you have a support agreement with Neo4j, please link this GitHub issue to a new or existing Zendesk ticket.

Thanks again! 🙏

Masadow changed the title from "Querying an interface is very slow" to "Querying an interface produces very slow query with a lot of UNION" Apr 30, 2024
a-alle added the "performance" label and removed the "bug report" label Apr 30, 2024
a-alle removed this from Bug reports in Bug Triage Apr 30, 2024

@dumitru-marian-barbu

This is a big problem even when using interfaces in @relationship fields. If a field's type is an interface implemented by around 10 types, the generated unions take a long time to resolve, even if only one node in the database actually corresponds to that @relationship.

@angrykoala
Member

I've been working on reproducing this issue with the following in Neo4j 5:

Typedefs

    interface Parent {
        id: String
        label: String
    }

    type Child1 implements Parent {
        id: String
        label: String
    }

    type Child2 implements Parent {
        id: String
        label: String
    }

    type Child3 implements Parent {
        id: String
        label: String
    }

    type Child4 implements Parent {
        id: String
        label: String
    }

    type Child5 implements Parent {
        id: String
        label: String
    }

    type Child6 implements Parent {
        id: String
        label: String
    }

    type Child7 implements Parent {
        id: String
        label: String
    }

    type Child8 implements Parent {
        id: String
        label: String
    }

    type Child9 implements Parent {
        id: String
        label: String
    }

    type Child10 implements Parent {
        id: String
        label: String
    }
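Since the slowdown is expected to depend on how many types implement the interface, it can be handy to generate these type definitions for an arbitrary count rather than writing them out. A minimal sketch, assuming the same field set as above; `buildTypeDefs` is a hypothetical helper name, not a library function:

```typescript
// Hypothetical helper: produces the Parent interface plus `childCount`
// implementing types (Child1..ChildN), mirroring the typedefs above.
function buildTypeDefs(childCount: number): string {
    const fields = "    id: String\n    label: String";
    const parent = `interface Parent {\n${fields}\n}`;
    const children = Array.from(
        { length: childCount },
        (_, i) => `type Child${i + 1} implements Parent {\n${fields}\n}`
    );
    return [parent, ...children].join("\n\n");
}

// Same schema as in this comment: one interface, ten implementations.
const typeDefs = buildTypeDefs(10);
```

The resulting string can be passed as `typeDefs` when constructing the schema, which makes it easy to repeat the comparison with, say, 5, 10 and 50 implementing types.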

Data

UNWIND range(1, 1000) AS id
CREATE(:Child1 {id: id+"c1", label: "c1"})
CREATE(:Child2 {id: id+"c2", label: "c1"})
CREATE(:Child3 {id: id+"c3", label: "c1"})
CREATE(:Child4 {id: id+"c4", label: "c1"})
CREATE(:Child5 {id: id+"c5", label: "c1"})
CREATE(:Child6 {id: id+"c6", label: "c1"})
CREATE(:Child7 {id: id+"c7", label: "c1"})
CREATE(:Child8 {id: id+"c8", label: "c1"})
CREATE(:Child9 {id: id+"c9", label: "c1"})
CREATE(:Child10 {id: id+"c10", label: "c1"})
RETURN id

Query:

query SlowQuery {
   parents(where: {id: "200c2"}) {
      id
      label
   }
}

Does this setup accurately reflect your issue, @Masadow?

With this setup, I compared it against a query that targets the child type directly by its label (essentially the fastest way to fetch that element with GraphQL):

query FastQuery {
   child2s(where: {id: "202c2"}) {
      id
      label
   }
}

I measured the interface query taking around 3x as long as the direct query through GraphQL (and ~6x when running the Cypher directly). Is that roughly the degradation you are experiencing, or am I missing something in my setup that would make it worse?

It would help to know roughly how much data you have and which version of the database you are running.
