Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unhashable type: 'list' when using list of boolean in in_() #489

Open
TeddyCr opened this issue Sep 5, 2022 · 2 comments
Open

unhashable type: 'list' when using list of boolean in in_() #489

TeddyCr opened this issue Sep 5, 2022 · 2 comments
Assignees
Labels
api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@TeddyCr
Copy link
Contributor

TeddyCr commented Sep 5, 2022

Environment details

  • OS type and version: Mac M1 (MacOS Monterey)
  • Python version: python --version 3.9.9
  • pip version: pip --version 22.1.2
  • sqlalchemy-bigquery version: pip show sqlalchemy-bigquery 1.4.3
  • sqlalchemy 1.4.27

Steps to reproduce

  1. create a sqlalchemy session
  2. create a sqlalchemy declarative meta table object
  3. execute the following query
from sqlalchemy import case, column, func

...

sessiom.query(sum(case([(column(<column_name>).in_([True, False]), 1)], else_=0))).select_from(<ORM table object>).first()

Stack trace

SELECT SUM(CAST(CASE WHEN (`tag_based` IN UNNEST(%(tag_based_1:BOOL)s)) THEN %(param_1:INT64)s ELSE %(param_2:INT64)s END AS NUMERIC)) AS `countInSet` 
FROM `stackoverflow`.`badges`
Traceback (most recent call last):
  File "/Users/teddycrepineau/.venv/omd_3_9/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1702, in _execute_context
    context = constructor(
  File "/Users/teddycrepineau/.venv/omd_3_9/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 1080, in _init_compiled
    param = {
  File "/Users/teddycrepineau/.venv/omd_3_9/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 1081, in <dictcomp>
    key: processors[key](compiled_params[key])
  File "/Users/teddycrepineau/.venv/omd_3_9/lib/python3.9/site-packages/sqlalchemy/sql/sqltypes.py", line 1984, in process
    value = _strict_as_bool(value)
  File "/Users/teddycrepineau/.venv/omd_3_9/lib/python3.9/site-packages/sqlalchemy/sql/sqltypes.py", line 1957, in _strict_as_bool
    if value not in self._strict_bools:
TypeError: unhashable type: 'list'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/teddycrepineau/.venv/omd_3_9/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 2810, in first
    return self.limit(1)._iter().first()
  File "/Users/teddycrepineau/.venv/omd_3_9/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 2894, in _iter
    result = self.session.execute(
  File "/Users/teddycrepineau/.venv/omd_3_9/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1689, in execute
    result = conn._execute_20(statement, params or {}, execution_options)
  File "/Users/teddycrepineau/.venv/omd_3_9/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1614, in _execute_20
    return meth(self, args_10style, kwargs_10style, execution_options)
  File "/Users/teddycrepineau/.venv/omd_3_9/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 325, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/Users/teddycrepineau/.venv/omd_3_9/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1481, in _execute_clauseelement
    ret = self._execute_context(
  File "/Users/teddycrepineau/.venv/omd_3_9/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1708, in _execute_context
    self._handle_dbapi_exception(
  File "/Users/teddycrepineau/.venv/omd_3_9/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2026, in _handle_dbapi_exception
    util.raise_(
  File "/Users/teddycrepineau/.venv/omd_3_9/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/Users/teddycrepineau/.venv/omd_3_9/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1702, in _execute_context
    context = constructor(
  File "/Users/teddycrepineau/.venv/omd_3_9/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 1080, in _init_compiled
    param = {
  File "/Users/teddycrepineau/.venv/omd_3_9/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 1081, in <dictcomp>
    key: processors[key](compiled_params[key])
  File "/Users/teddycrepineau/.venv/omd_3_9/lib/python3.9/site-packages/sqlalchemy/sql/sqltypes.py", line 1984, in process
    value = _strict_as_bool(value)
  File "/Users/teddycrepineau/.venv/omd_3_9/lib/python3.9/site-packages/sqlalchemy/sql/sqltypes.py", line 1957, in _strict_as_bool
    if value not in self._strict_bools:
sqlalchemy.exc.StatementError: (builtins.TypeError) unhashable type: 'list'
[SQL: SELECT SUM(CAST(CASE WHEN (`tag_based` IN UNNEST(%(tag_based_1:BOOL)s)) THEN %(param_1:INT64)s ELSE %(param_2:INT64)s END AS NUMERIC)) AS `countInSet` 
FROM `stackoverflow`.`badges`
 LIMIT %(param_3:INT64)s]

When comparing with other dialects (i.e. redshift), I can see that bigquery passes a list to the processor function in sqlalchemy/engine/default.py (line 1081) which causes an issue in sqlalchemy/sql/sqltypes.py (line 1957) where the below check is performed

    _strict_bools = frozenset([None, True, False])

    def _strict_as_bool(self, value):
        if value not in self._strict_bools:
            if not isinstance(value, int):
                raise TypeError("Not a boolean value: %r" % value)
            else:
                raise ValueError(
                    "Value %r is not None, True, or False" % value
                )
        return value

Other dialects seem to break down each parameter into a single value. For example, when performing the same query with redshift dialect I can see the following compiled parameters

[{'param_1': 1, 'param_2': 0, 'param_3': 1, 'is_customer_1_1': 'True', 'is_customer_1_2': 'False'}]

vs bigquery

[{'tag_based_1': [True, False], 'param_1': 1, 'param_2': 0, 'param_3': 1}]
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API. label Sep 5, 2022
@zzzeek
Copy link

zzzeek commented Sep 21, 2022

it seems likely the issue stems from this where it appears the whole approach SQLAlchemy takes for IN is somehow modified or bypassed.

I see the comment "Bigquery can accept an ARRAY directly", OK that's nice, some other DBAPIs can also, but if that's the main rationale for this code, perhaps it's best to just let SQLAlchemy do the IN the way it wants to, so that things aren't breaking and SQLAlchemy isn't awkwardly pinned at 1.4.27 - then come to us and say, "I'd like SQLAlchemy to support a dialect flag that will use arrays directly for IN rather than expanding them". Because we can do that on our end, it will be tested, and it wont break (at least not this much). beyond that, IN expressions are usually pretty small and if the only point here is optimization, this is likely premature.

@hashboard-charlie
Copy link

hashboard-charlie commented Dec 13, 2022

For others who run into this issue, my team decided to work around it by converting IN expressions into a series of OR expressions when appropriate:

sa.or_(*[
    column == value if value is not None else column.is_(None)
    for value in values
])

We only applied this workaround for booleans, since that seems to be the only affected type, and the difference in performance (if there is any after query planning) between IN and the manual OR can't be that significant for booleans (which have a max of three distinct values in the statement).


While that was enough to get us unblocked, curious if there's an update on this from the package maintainers. IN expressions for boolean columns isn't that niche a use case, such that I'm surprised how low traffic this issue is. Mike suggested a concrete path forward (thanks for the detailed explanation!), but it isn't clear if any there's any plan to implement their advice.

Am I missing some other resolution?

@meredithslota meredithslota added the type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. label Jun 13, 2023
@kiraksi kiraksi self-assigned this Dec 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.
Projects
None yet
Development

No branches or pull requests

5 participants