Discussion: Identifying slow queries at build time #7

rattrayalex opened this issue Apr 17, 2021 · 0 comments
rattrayalex commented Apr 17, 2021

I'm not sure yet how much of this involves writing a tool versus just documenting a pattern, and I'm not sure to what extent it relates to this repo versus graphile-engine etc.

It would be awesome to write tests that, for each persisted operation, run the query with EXPLAIN against the server while it points to a database that represents production (perhaps production itself), with representative variables.

Something maybe very loosely like this:

```js
// in your app
const myQuery = gql`
  query myQuery($condition: String)
    @explain(variables: {condition: "foo"}, maxcost: 1000)
    @explain(variables: {condition: "bar"}, maxcost: 5000)
  {
    getUsers(condition: $condition) {
      nodes {
        username
      }
    }
  }
`
```
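A build step would need to pull these hypothetical `@explain` annotations out of each operation before sending anything to the database. A rough sketch of what that extraction could look like (the directive name and argument shapes are assumptions from the example above, not an existing API, and a real implementation would use a proper GraphQL parser rather than a regex):

```javascript
// Extract hypothetical @explain(variables: {...}, maxcost: N) annotations
// from an operation's source text. Regex-based sketch for illustration only;
// a real tool would walk the parsed GraphQL AST instead.
function extractExplainAnnotations(operationText) {
  const annotations = [];
  const re = /@explain\(variables:\s*(\{[^}]*\}),\s*maxcost:\s*(\d+)\)/g;
  let match;
  while ((match = re.exec(operationText)) !== null) {
    annotations.push({
      // Loosely quote the keys of {condition: "foo"} so JSON.parse accepts it.
      variables: JSON.parse(match[1].replace(/(\w+):/g, '"$1":')),
      maxcost: Number(match[2]),
    });
  }
  return annotations;
}

const annotations = extractExplainAnnotations(`
  query myQuery($condition: String)
    @explain(variables: {condition: "foo"}, maxcost: 1000)
    @explain(variables: {condition: "bar"}, maxcost: 5000)
  { getUsers(condition: $condition) { nodes { username } } }
`);
// annotations[0] → { variables: { condition: "foo" }, maxcost: 1000 }
```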

This would generate several files:

.persisted-operations/client.json:

```json
{
  "myQuery": "xyzsha"
}
```

.persisted-operations/xyzsha.graphql:

```graphql
query myQuery($condition: String) {
  ...
}
```

.persisted-operations/xyzsha-myQuery-foo.sql:

```sql
-- query: ./xyzsha.graphql (easy to cmd+click through to the gql in editors like VS Code)
-- variable: condition=foo
select username from users where thing = 'foo';

/*
  EXPLAIN results here, showing the cost and query plan
*/
```

.persisted-operations/xyzsha-myQuery-bar.sql:

```sql
-- query: ./xyzsha.graphql
-- variable: condition=bar
select username from users where thing = 'bar';

/*
  EXPLAIN results here, showing the cost and query plan
*/
```

This way, developers can easily see the SQL generated from the queries they write, as well as the query plan (imagine actually seeing where a Seq Scan is used!). Reviewers can audit it. And the build can fail if EXPLAIN estimates the cost of a query to be higher than the specified maxcost.
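Failing the build on maxcost would mean parsing the planner's estimated cost out of the EXPLAIN output. A sketch against the text format, whose top line looks like `Seq Scan on users  (cost=0.00..155.00 rows=10000 width=32)` (parsing `EXPLAIN (FORMAT JSON)` output would be more robust; the function names and error shape here are assumptions):

```javascript
// The number after ".." in the root plan node is the planner's total
// estimated cost, which is what we'd compare against maxcost.
function totalCost(explainOutput) {
  const match = explainOutput.match(/cost=\d+(?:\.\d+)?\.\.(\d+(?:\.\d+)?)/);
  if (!match) throw new Error('could not find cost estimate in EXPLAIN output');
  return Number(match[1]);
}

function checkMaxCost(explainOutput, maxcost) {
  const cost = totalCost(explainOutput);
  if (cost > maxcost) {
    throw new Error(`estimated cost ${cost} exceeds maxcost ${maxcost}`);
  }
  return cost;
}

const plan = 'Seq Scan on users  (cost=0.00..155.00 rows=10000 width=32)';
// checkMaxCost(plan, 1000) → 155; checkMaxCost(plan, 100) throws
```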

Maybe this could even be part of graphile pro.

Problems / open questions:

  1. Of course, developers will have to be responsible for thinking of appropriate variables to test, but that's a core skill in software development, and once something you didn't think of comes up, you can add it as a performance regression test.
  2. Working with query planner costs can be tricky, and sometimes the planner is way off (or gives an unhelpfully wide range), but it's the best tool around.
  3. Naming the files gets weird; you'd want an intuitive, browsable scheme, ideally one that isn't hell in source control. Maybe specify a @name for each persisted operation instead of using a sha? Maybe what I have here would be fine enough, and maybe even just using shas would be fine.
  4. Failing the build on a high EXPLAIN cost estimate could cause the build to sporadically fail, for example when the statistics change.
    a. Silly me, this would only be the case if you re-ran the SQL queries in CI, which isn't a good idea, both for this reason and for performance.
  5. Should you EXPLAIN ANALYZE on non-mutation queries? Perhaps with a low statement_timeout to prevent dev iteration from taking down prod?
  6. You probably don't want to be running a bunch of EXPLAIN ANALYZEs against production all the time, especially as you're iterating in dev.
    a. This might not be nearly as bad as it sounds, though, because you'd only re-run it when the sha of a gql query changes, and you'll probably only be doing one or two at a time.
    b. Of course, you can also just run this against a staging or secondary db with similar data to production, or use a tool to regularly sync representative/censored prod data to dev db's.
  7. The generated query plans and costs for a given gql query will fall out of date over time, as the statistics in the db change, so it's possible a maxcost annotation would be silently exceeded for a time.
    a. A command to regenerate all SQL files, run manually from time to time, would probably be fine for this.
    b. Substantial changes to query plans that affect performance should be caught by production observability tooling anyway; the point is to not actively write and ship queries that are slow from the beginning.
  8. Changing the backend implementation (for example, by wrapping a resolver) wouldn't automatically trigger regeneration of the SQL; you'd have to do that manually.
@benjie benjie changed the title Brainstorm: Identifying slow queries at build time Discussion: Identifying slow queries at build time Apr 18, 2021