Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

experiment: use pipeline mode #2707

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from
Draft

experiment: use pipeline mode #2707

wants to merge 6 commits into from

Conversation

robx
Copy link
Contributor

@robx robx commented Mar 16, 2023

This pulls in an exploratory implementation of pipeline mode (#2295). The bulk of the change is in the supporting libraries:

  • postgresql-libpq is extended to wrap the libpq pipeline mode API
  • hasql is hacked to allow queueing up pipelined statements, with pipeline synchronization and result reading (ignoring) deferred until the next result is required

Some notes as to the state of this:

  • the hasql change is very much just a minimal hack to allow us to evaluate pipeline mode in postgrest -- it makes little sense as some kind of "pipeline mode for a hasql" feature
  • the implementation seems mostly correct, but error handling is at least a bit broken (we don't treat aborted pipelines quite right), and there's a decent chance that some failure scenarios actually mess up the connection state, though I haven't seen that
  • this includes running postgrest-loadtest with artificial latency #2682 to allow simulating slow postgres

@robx
Copy link
Contributor Author

robx commented Mar 16, 2023

Some performance results, using postgrest-loadtest.

pipeline=yes pipeline=no main branch
pgdelay=0 300 287 294
pgdelay=1ms 62 55 55
pgdelay=5ms pgrst_delay=5ms 17.3 14.5 14.5
pgdelay=1ms pgrst_delay=10ms 26.6 24.8 24.7
pgdelay=10ms 12.1 9.6 9.6
pgdelay=50ms 2.7 2.2 2.2
  • the number is request rate from loadtest output -- it isn't particularly stable, e.g. I wouldn't trust the 300 > 294 in the first row to signal an improvement. But the overall improvement between the columns seem consistent.
  • pipeline=yes/no is on this branch, with usePipeline set to True or False; main branch is with unmodified dependencies
  • command line is e.g. PGRST_BUILD_CABAL=1 PGDELAY=1ms PGRST_DELAY=10ms postgrest-loadtest
  • PGRST_BUILD_CABAL=1 says to build using postgrest-build for quicker iteration (there's a supporting change in this PR)

Regarding the results:

  • there's some overhead in hasql that comes with supporting pipeline mode; this seem to be noticeable as a slight performance cost in the undelayed scenario
  • as soon as there's a bit of a latency towards postgresql (regardless of how that latency compares to the http client latency), pipeline mode does seem to provide a measurable benefit

@robx robx mentioned this pull request Mar 16, 2023
@robx
Copy link
Contributor Author

robx commented Mar 16, 2023

The library changes:

The postgresql-libpq change is essentially good to go upstream, but I haven't filed it yet.

@wolfgangwalther
Copy link
Member

Very cool!

Comment on lines 239 to 244
usePipeline :: Bool
usePipeline = True

queuePipelineStatement :: params -> SQL.Statement params () -> SQL.Transaction ()
queuePipelineStatement params stmt =
if usePipeline then SQL.inTransaction $ Session.queuePipelineStatement params stmt
else SQL.statement params stmt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So awesome that is such a small change.

the implementation seems mostly correct, but error handling is at least a bit broken (we don't treat aborted pipelines quite right), and there's a decent chance that some failure scenarios actually mess up the connection state, though I haven't seen that

So since there could be some unknown unknowns(pipeline mode also new to me), I think we should make this configurable. Name could be db-pipeline-mode. True by default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, the large change is in hasql. And I don't think it's a particularly sensible implementation there API-wise so far.

But yes gating this by a configuration option makes sense.

@steve-chavez
Copy link
Member

Just FYI, I'm leaving the pgbench pipeline test on steve-chavez@682930d.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants