Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cannot send arbitrary number of queries in a single call to read_sql #552

Open
ypsah opened this issue Nov 6, 2023 · 1 comment
Open
Labels
bug Something isn't working

Comments

@ypsah
Copy link

ypsah commented Nov 6, 2023

What language are you using?

Python

What version are you using?

0.3.2

What database are you using?

PostgreSQL (timescale) + pgbouncer

What dataframe are you using?

Arrow2 (should not matter)

Can you describe your bug?

My pgbouncer instance is configured to use a connection pool of 50 slots and to allow 500 client connections per user/database.

When using the session pooling mode of pgbouncer (1-1 client connection/server connection for the duration of a session). I cannot send more than 50 queries per call to cx.read_sql.

With the other pooling modes (transaction/statement, which only are an option since pgbouncer 1.21), that number climbs to 500.

Sending even one more query above these thresholds will result in a python exception being raised (pretty reliable except for session pooling which sometimes appears to just hang).

What are the steps to reproduce the behavior?

Run timescale + pgbouncer with either:

  • pool_mode set to session, and default_pool_size set to N (e.g. 50);
  • pgbouncer version 1.21+, max_prepared_statement set to some strictly positive value (e.g. 100), pool_mode set to transaction or statement, and max_client_conn set to N (e.g. 500).

Run the following snippet (replacing uri with the connection string for your pgbouncer instance and N with the value you configured pgbouncer with):

import connector as cx

cx.read_sql(uri, ["SELECT 1"] * N)  # OK
cx.read_sql(uri, ["SELECT 1"] * (N + 1))  # KO
Database setup if the error only happens on specific data or data type

N/A

Example query / code
import connector as cx

cx.read_sql(uri, ["SELECT 1"] * (N + 1))

What is the error?

In session pooling mode, either a hang, or an exception that mentions having hit query_wait_timeout (pgbouncer setting).

In the other pooling modes, an exception that mentions r2d2 failed to acquire enough connection slots.

@ypsah ypsah added the bug Something isn't working label Nov 6, 2023
@ypsah
Copy link
Author

ypsah commented Nov 6, 2023

It's quite easy to work around this limitation by using smaller query batches, so it's fine by me if connectorx keeps its current behaviour.

I'm merely surprised that connectorx uses r2d2 for connection pooling but actually requests one connection per partition. And maybe even more importantly, connectorx does not seem to release connections as queries complete (within the scope of a single call to read_sql).

Could you please comment on this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant