Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent error when closing an unused cursor #405

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

evansd
Copy link

@evansd evansd commented Sep 4, 2023

SQLAlchemy will attempt to close cursor objects even if, for whatever reason, they haven't actually been used and therefore have no associated query to cancel.

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(*) Release notes are required, with the following suggested text:

* Support cleanly closing unused cursor

SQLAlchemy will attempt to close cursor objects even if, for whatever
reason, they haven't actually been used and therefore have no associated
query to cancel.
@cla-bot
Copy link

cla-bot bot commented Sep 4, 2023

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks,

can you share some sequence of operations which leads this to happen? It'd be useful to have a regression test for this if possible.

@evansd
Copy link
Author

evansd commented Sep 4, 2023

Hi, this is slightly tricky to reproduce but I'm seeing it in the tracebacks from failures in CI when our test Trino instance goes down. It's something to do with SQLAlchemy's pool manager trying to close out connections after an error.

Here's the key bit of the traceback:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2199, in _safe_close_cursor
    cursor.close()
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/dbapi.py", line 712, in close
    self.cancel()
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/dbapi.py", line 706, in cancel
    raise trino.exceptions.OperationalError(
trino.exceptions.OperationalError: Cancel query failed; no running query

[info   ] failed after 3 attempts [trino.client]
[error  ] Error closing cursor [sqlalchemy.pool.impl.QueuePool]
And here's the full thing (click to expand)
Traceback (most recent call last):
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/util/connection.py", line 95, in create_connection
    raise err
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/util/connection.py", line 85, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 398, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 244, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/http/client.py", line 1286, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/http/client.py", line 1332, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/http/client.py", line 1281, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/http/client.py", line 1041, in _send_output
    self.send(msg)
  File "/opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/http/client.py", line 979, in send
    self.connect()
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 205, in connect
    conn = self._new_conn()
           ^^^^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fc60f9c8310>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=32769): Max retries exceeded with url: /v1/statement (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc60f9c8310>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1965, in _exec_single_context
    self.dialect.do_execute(
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/sqlalchemy/dialect.py", line 373, in do_execute
    cursor.execute(statement, parameters)
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/dbapi.py", line 573, in execute
    self._prepare_statement(operation, statement_name)
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/dbapi.py", line 448, in _prepare_statement
    query.execute()
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/client.py", line 806, in execute
    response = self._request.post(self._query, additional_http_headers)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/client.py", line 555, in post
    http_response = self._post(
                    ^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/client.py", line 899, in decorated
    raise error
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/client.py", line 886, in decorated
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/requests/sessions.py", line 637, in post
    return self.request("POST", url, data=data, json=json, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=32769): Max retries exceeded with url: /v1/statement (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc60f9c8310>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2199, in _safe_close_cursor
    cursor.close()
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/dbapi.py", line 712, in close
    self.cancel()
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/dbapi.py", line 706, in cancel
    raise trino.exceptions.OperationalError(
trino.exceptions.OperationalError: Cancel query failed; no running query

[info   ] failed after 3 attempts [trino.client]
[error  ] Error closing cursor [sqlalchemy.pool.impl.QueuePool]

@evansd
Copy link
Author

evansd commented Sep 4, 2023

Oh, interesting. Looking at the CI failures here there's an explicit test for the behaviour which my patch tries to remove:

def test_close_cursor(trino_connection):
cur = trino_connection.cursor()
cur.execute("SELECT * FROM tpch.sf1.customer")
cur.fetchone()
cur.close() # would raise an exception if cancel fails
cur = trino_connection.cursor()
with pytest.raises(Exception) as cancel_error:
cur.close()
assert "Cancel query failed; no running query" in str(cancel_error.value)

But I'm not sure that is the behaviour you actually want. The test was added in fca57d4 and looks like a direct copy-paste of the test for cancel():

def test_cancel_query(trino_connection):
cur = trino_connection.cursor()
cur.execute("SELECT * FROM tpch.sf1.customer")
cur.fetchone()
cur.cancel() # would raise an exception if cancel fails
cur = trino_connection.cursor()
with pytest.raises(Exception) as cancel_error:
cur.cancel()
assert "Cancel query failed; no running query" in str(cancel_error.value)

I think that explicitly calling cancel() on a cursor with no query should raise an an exception, but not so for the implicit cancellation that happens on close().

@hashhar
Copy link
Member

hashhar commented Sep 4, 2023

I think that explicitly calling cancel() on a cursor with no query should raise an an exception, but not so for the implicit cancellation that happens on close().

Yes, that is a fair point.

cc: @mdesmet @hovaesco PTAL as well.

@hashhar
Copy link
Member

hashhar commented Sep 4, 2023

In my head the logic should be:

  • On cursor.close()
    • Cancel any already executing queries if they exist
    • Perform implicit ROLLBACK if within a transaction
    • Set some boolean so that calling any method of the cursor leads to Error

We don't do steps 2 and 3 here (both required by DB-API).

  • On cursor.cancel() - this is not a DB-API specified method so the behaviour is up to us to specify
    • Cancel any already executing queries
      • If query already finished, return False
      • If query is running and was cancelled successfully, return True
      • If no query was executed at all (how to differentiate between this and already finished? - presence of query on the cursor?) then raise Exception

@hashhar
Copy link
Member

hashhar commented Sep 15, 2023

@evansd Me and @hovaesco discussed this offline and we agree with the changes described in my previous comment.

So can you please change cursor.cancel() behaviour to following:

  • If query is running and was cancelled successfully, return True
  • In every other case return False
    • e.g. query was already finished
    • e.g. no query was executed at all

And add some tests to verify this new behavior.

@hashhar
Copy link
Member

hashhar commented Oct 13, 2023

@evansd Any updates here?

@evansd
Copy link
Author

evansd commented Oct 13, 2023

Not yet, I'm afraid. This isn't a blocking issue for us at the moment as we've worked around the problem we had of the Trino instance going down during CI runs, so the behaviour in question no longer gets triggered.

I'm still planning to make the fix following the behaviour you've outlined; but we've got a lot of other high priority work on at the moment so it won't be for a few weeks.

@hashhar
Copy link
Member

hashhar commented Oct 13, 2023

Thanks for the update @evansd.

@hashhar
Copy link
Member

hashhar commented Dec 22, 2023

Some parts of this were addressed in #405

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants