Can I separate I/O from MetaData writes while autoloading a Table, so that I can wrap a lock only around the MetaData writes? #503

Open
radix opened this issue Nov 9, 2022 · 0 comments
Labels: api: bigquery, type: feature request

Comments

radix commented Nov 9, 2022

TL;DR: while autoloading a Table, is there a way to separate the network I/O from the manipulation of the SQLAlchemy metadata, so I can wrap my lock only around the MetaData manipulation?

I autoload a bunch of Table objects in my app. Sometimes the BigQuery auth backend seems to time out, which causes the Table loading to fail. (For whatever reason, this mostly seems to happen during auth rather than on the actual connections to BigQuery -- maybe just because auth has a really long timeout and happens first?)

I also load these Table objects under a lock. This is necessary because the SQLAlchemy documentation says MetaData "may not be completely thread-safe" (whatever that means), and I have seen this first-hand: we had a bug where Table metadata would get corrupted when autoloading without a lock -- we would commonly end up with Tables that had no columns. So we introduced the lock.

The problem is that sometimes we also get these persistent timeouts in the Google auth backend (the stack trace below, captured with faulthandler.dump_traceback(), shows this). Since the default timeout is 120 seconds (and, as an aside, I haven't seen a way to override that other than monkey-patching), this leads to a pileup of lock-acquisition failures in my app, because all of this code runs implicitly in the Table() instantiation.

What I would like is to separate the network requests for authentication and fetching of table information from the manipulation of the SQLAlchemy MetaData, so that I can wrap the lock acquisition only around the part that updates the MetaData. Why? It's a best practice to hold locks only around the code that actually needs to be serialized, and here it would localize failures so that only the threads that are genuinely having problems die; I don't want to kill every request over the course of 2 minutes, only the one request that happened to time out.

I don't know if SQLAlchemy really makes this possible, but maybe there's a way to preload the data into a cache, or at least force authentication, before my call to Table()?
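
For concreteness, here's the kind of thing I'm imagining as a workaround (a rough, untested sketch -- bq_client and load_table_warm are names I'm making up here, and I'm using a bare "with lock:" instead of my timeout helper for brevity). It doesn't share any fetched metadata with SQLAlchemy's reflection, which still re-queries inside the lock; it just forces auth and surfaces any timeout before the lock is taken:

from threading import Lock

from google.cloud import bigquery
from sqlalchemy import Table

lock = Lock()
bq_client = bigquery.Client()  # assumes default credentials and project

def load_table_warm(tablename, sqlalchemy_meta, my_engine):
    # Outside the lock: force a credential refresh and one metadata fetch.
    # If auth hangs or times out, only this thread fails; the lock is never held.
    bq_client.get_table(tablename)  # "dataset.table" or "project.dataset.table"; result discarded

    # Inside the lock: the actual reflection. It re-fetches the table metadata,
    # but credentials and the HTTP connection should now be warm.
    with lock:
        return Table(
            tablename,
            sqlalchemy_meta,
            autoload=True,
            autoload_with=my_engine,
        )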

Environment details

  • Ubuntu 22.04.1
  • Python 3.9.15
  • pip 22.3
  • sqlalchemy-bigquery version: 1.4.4
  • sqlalchemy: 1.3.24

Steps to reproduce

  • autoload a bunch of Tables into the same MetaData, doing it under a lock because MetaData is not thread-safe for writes
  • do this a lot, from different threads, and wait for auth to time out, leading to many threads hitting lock-timeout failures while waiting on one thread to finish its network requests

Code example

from threading import Lock
import faulthandler

from sqlalchemy import Table

lock = Lock()

def load_table(tablename, sqlalchemy_meta, my_engine):
    try:
        with acquire_with_timeout(lock, 10):
            sqlalchemy_table = Table(
                tablename,
                sqlalchemy_meta,
                autoload=True,
                autoload_with=my_engine,
            )
    except TimeoutError:
        faulthandler.dump_traceback()
        raise


# this is the definition of acquire_with_timeout:

from contextlib import contextmanager

@contextmanager
def acquire_with_timeout(lock, timeout):
    result = lock.acquire(timeout=timeout)
    if not result:
        raise TimeoutError(f"Timeout acquiring lock {lock!r}")
    try:
        yield
    finally:
        lock.release()
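
For completeness, load_table gets called from many threads at once, roughly like this (simplified -- the real engine/MetaData setup is different and the table names are placeholders):

from concurrent.futures import ThreadPoolExecutor

from sqlalchemy import MetaData, create_engine

# Placeholder setup; the real project, dataset, and table names differ.
my_engine = create_engine("bigquery://my-project")
sqlalchemy_meta = MetaData()
table_names = ["dataset.table_a", "dataset.table_b", "dataset.table_c"]

with ThreadPoolExecutor(max_workers=8) as pool:
    futures = [
        pool.submit(load_table, name, sqlalchemy_meta, my_engine)
        for name in table_names
    ]
    for fut in futures:
        # .result() re-raises any exception from the worker thread,
        # e.g. the TimeoutError from acquire_with_timeout above.
        fut.result()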

Stack trace

Here's the stack trace from the thread that appears to be stuck creating a connection for auth (captured with faulthandler.dump_traceback(), so it doesn't include source lines):

Thread 0x00007f71c57fa640 (most recent call first):
  File "/venv/lib/python3.9/site-packages/urllib3/util/connection.py", line 74 in create_connection
  File "/venv/lib/python3.9/site-packages/urllib3/connection.py", line 159 in _new_conn
  File "/venv/lib/python3.9/site-packages/urllib3/connection.py", line 309 in connect
  File "/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 978 in _validate_conn
  File "/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 381 in _make_request
  File "/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 670 in urlopen
  File "/venv/lib/python3.9/site-packages/requests/adapters.py", line 439 in send
  File "/venv/lib/python3.9/site-packages/requests/sessions.py", line 643 in send
  File "/venv/lib/python3.9/site-packages/requests/sessions.py", line 530 in request
  File "/venv/lib/python3.9/site-packages/google/auth/transport/requests.py", line 482 in request
  File "/venv/lib/python3.9/site-packages/google/cloud/_http/__init__.py", line 379 in _do_request
  File "/venv/lib/python3.9/site-packages/google/cloud/_http/__init__.py", line 341 in _make_request
  File "/venv/lib/python3.9/site-packages/google/cloud/_http/__init__.py", line 482 in api_request
  File "/venv/lib/python3.9/site-packages/google/api_core/retry.py", line 190 in retry_target
  File "/venv/lib/python3.9/site-packages/google/api_core/retry.py", line 283 in retry_wrapped_func
  File "/venv/lib/python3.9/site-packages/google/cloud/bigquery/client.py", line 759 in _call_api
  File "/venv/lib/python3.9/site-packages/google/cloud/bigquery/client.py", line 1011 in get_table
  File "/venv/lib/python3.9/site-packages/sqlalchemy_bigquery/base.py", line 937 in _get_table
  File "/venv/lib/python3.9/site-packages/sqlalchemy_bigquery/base.py", line 950 in get_columns
  File "/venv/lib/python3.9/site-packages/sqlalchemy/engine/reflection.py", line 390 in get_columns
  File "/venv/lib/python3.9/site-packages/sqlalchemy/engine/reflection.py", line 664 in reflecttable
  File "/venv/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 484 in reflecttable
  File "/venv/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1653 in run_callable
  File "/venv/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2212 in run_callable
  File "/venv/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 670 in _autoload
  File "/venv/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 647 in _init
  File "/venv/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 558 in __new__
  File "/venv/lib/python3.9/site-packages/sqlalchemy/util/deprecations.py", line 139 in warned
  File "<string>", line 2 in __new__
  File "MY_CODE.py", line 619 in load_table