TST: BigQuery tests fail with latest google-cloud-bigquery package #2280

Closed
tswast opened this issue Jul 14, 2020 · 2 comments

tswast commented Jul 14, 2020

After removing pymapd, I'm able to run the BigQuery tests locally. Several tests currently fail:

$ pytest ibis/bigquery/tests
============================================================ test session starts ============================================================
platform darwin -- Python 3.7.6, pytest-5.4.1, py-1.9.0, pluggy-0.13.1
rootdir: /Users/swast/src/ibis, inifile: setup.cfg
plugins: forked-1.2.0, mock-3.0.0, xdist-1.31.0, cov-2.8.1
collected 165 items                                                                                                                         

ibis/bigquery/tests/test_client.py .............................F.F...................F...............F.                              [ 41%]
ibis/bigquery/tests/test_compiler.py .............................................F..........................                         [ 85%]
ibis/bigquery/tests/test_datatypes.py ................x.x..xxx                                                                        [100%]

================================================================= FAILURES ==================================================================
__________________________________________________________ test_scalar_param_array __________________________________________________________

alltypes = BigQueryTable[table]
  name: swast-scratch.testing.functional_alltypes
  schema:
    index : int64
    Unnamed_0 : int...4
    date_string_col : string
    string_col : string
    timestamp_col : timestamp
    year : int64
    month : int64
df =       index  Unnamed_0    id  bool_col  tinyint_col  ...  date_string_col  string_col           timestamp_col  year  m...     True            6  ...         01/31/10           6 2010-01-31 05:06:13.650  2010      1

[7300 rows x 15 columns]

    def test_scalar_param_array(alltypes, df):
        param = ibis.param('array<double>')
        expr = alltypes.sort_by('id').limit(1).double_col.collect() + param
        result = expr.execute(params={param: [1]})
        expected = [df.sort_values('id').double_col.iat[0]] + [1.0]
>       assert result == expected
E       ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

ibis/bigquery/tests/test_client.py:366: ValueError
_________________________________________________________ test_scalar_param_nested __________________________________________________________

client = <ibis.bigquery.client.BigQueryClient object at 0x7fd4735baa90>

    def test_scalar_param_nested(client):
        param = ibis.param('struct<x: array<struct<y: array<double>>>>')
        value = collections.OrderedDict(
            [('x', [collections.OrderedDict([('y', [1.0, 2.0, 3.0])])])]
        )
        result = client.execute(param, {param: value})
>       assert value == result
E       AssertionError: assert OrderedDict([...0, 3.0])])])]) == {'x': array([...dtype=object)}
E         Differing items:
E         {'x': [OrderedDict([('y', [1.0, 2.0, 3.0])])]} != {'x': array([{'y': array([1., 2., 3.])}], dtype=object)}
E         Use -v to get the full diff

ibis/bigquery/tests/test_client.py:383: AssertionError
___________________________________________________________ test_large_timestamp ____________________________________________________________

client = <ibis.bigquery.client.BigQueryClient object at 0x7fd4735baa90>

    def test_large_timestamp(client):
        huge_timestamp = datetime.datetime(year=4567, month=1, day=1)
        expr = ibis.timestamp('4567-01-01 00:00:00')
>       result = client.execute(expr)

ibis/bigquery/tests/test_client.py:594: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
ibis/client.py:215: in execute
    result = self._execute_query(query_ast, **kwargs)
ibis/bigquery/client.py:421: in _execute_query
    return query.execute()
ibis/bigquery/client.py:182: in execute
    result = self._fetch(cur)
ibis/bigquery/client.py:171: in _fetch
    df = cursor.query.to_dataframe()
../../miniconda3/envs/ibis-dev/lib/python3.7/site-packages/google/cloud/bigquery/job.py:3374: in to_dataframe
    create_bqstorage_client=create_bqstorage_client,
../../miniconda3/envs/ibis-dev/lib/python3.7/site-packages/google/cloud/bigquery/table.py:1731: in to_dataframe
    df = record_batch.to_pandas()
pyarrow/array.pxi:587: in pyarrow.lib._PandasConvertible.to_pandas
    ???
pyarrow/table.pxi:1640: in pyarrow.lib.Table._to_pandas
    ???
../../miniconda3/envs/ibis-dev/lib/python3.7/site-packages/pyarrow/pandas_compat.py:766: in table_to_blockmanager
    blocks = _table_to_blocks(options, table, categories, ext_columns_dtypes)
../../miniconda3/envs/ibis-dev/lib/python3.7/site-packages/pyarrow/pandas_compat.py:1102: in _table_to_blocks
    list(extension_columns.keys()))
pyarrow/table.pxi:1107: in pyarrow.lib.table_to_blocks
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   pyarrow.lib.ArrowInvalid: Casting from timestamp[us, tz=UTC] to timestamp[ns] would result in out of bounds timestamp: 81953424000000000

pyarrow/error.pxi:85: ArrowInvalid
____________________________________________________________ test_approx_median _____________________________________________________________

alltypes = BigQueryTable[table]
  name: swast-scratch.testing.functional_alltypes
  schema:
    index : int64
    Unnamed_0 : int...4
    date_string_col : string
    string_col : string
    timestamp_col : timestamp
    year : int64
    month : int64

    def test_approx_median(alltypes):
        m = alltypes.month
        expected = m.execute().median()
        assert expected == 7
    
        expr = m.approx_median()
        result = expr.execute()
>       assert result == expected
E       assert 6 == 7.0

ibis/bigquery/tests/test_client.py:718: AssertionError
_________________________________________________________________ test_cov __________________________________________________________________

alltypes = BigQueryTable[table]
  name: swast-scratch.testing.functional_alltypes
  schema:
    index : int64
    Unnamed_0 : int...4
    date_string_col : string
    string_col : string
    timestamp_col : timestamp
    year : int64
    month : int64
project_id = 'swast-scratch'

    def test_cov(alltypes, project_id):
        d = alltypes.double_col
        expr = d.cov(d)
        result = expr.compile()
        expected = f"""\
    SELECT COVAR_SAMP(`double_col`, `double_col`) AS `tmp`
    FROM `{project_id}.testing.functional_alltypes`"""
>       assert result == expected
E       AssertionError: assert 'SELECT\n  CO...nal_alltypes`' == 'SELECT COVAR...nal_alltypes`'
E         - SELECT COVAR_SAMP(`double_col`, `double_col`) AS `tmp`
E         + SELECT
E         +   COVAR_SAMP(ref_0
E         +   BigQueryTable[table]
E         +     name: swast-scratch.testing.functional_alltypes
E         +     schema:
E         +       index : int64...
E         
E         ...Full output truncated (40 lines hidden), use '-vv' to show

ibis/bigquery/tests/test_compiler.py:448: AssertionError
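
The test_scalar_param_array and test_scalar_param_nested failures both show results coming back as NumPy arrays where the tests expect plain lists. One possible way to compare them is to normalize the result first. The following is a minimal sketch, not a fix adopted in ibis: the `to_plain` helper is hypothetical, and the sample `result` value is copied from the assertion diff above.

```python
import collections

import numpy as np


def to_plain(value):
    """Recursively convert NumPy containers and scalars to plain Python objects.

    Hypothetical helper for illustration; it is not part of ibis.
    """
    if isinstance(value, np.ndarray):
        return [to_plain(item) for item in value]
    if isinstance(value, np.generic):
        return value.item()
    if isinstance(value, dict):
        return {key: to_plain(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_plain(item) for item in value]
    return value


# Shape of the result reported in the test_scalar_param_nested failure.
result = {"x": np.array([{"y": np.array([1.0, 2.0, 3.0])}], dtype=object)}
expected = collections.OrderedDict(
    [("x", [collections.OrderedDict([("y", [1.0, 2.0, 3.0])])])]
)

# Comparing after normalization avoids the ambiguous array truth value.
assert to_plain(result) == expected
```

For flat arrays such as the one in test_scalar_param_array, `numpy.testing.assert_array_equal(result, expected)` would be another option.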

tswast commented Jul 15, 2020

I filed googleapis/python-bigquery#168 for the ibis/bigquery/tests/test_client.py::test_large_timestamp failure. It will require an upstream fix that falls back to datetime objects (or perhaps offers fletcher as an option) to handle timestamps in this range.
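
To illustrate why the cast in the traceback fails, here is a small standard-library sketch. It assumes only that nanosecond timestamps are stored as signed 64-bit integers counted from the Unix epoch; the variable names are made up for this example.

```python
import datetime

INT64_MAX = 2**63 - 1

epoch = datetime.datetime(1970, 1, 1)
huge_timestamp = datetime.datetime(year=4567, month=1, day=1)

# Microseconds since the epoch: this is the value quoted in the ArrowInvalid error.
micros = (huge_timestamp - epoch) // datetime.timedelta(microseconds=1)

# Converting to nanoseconds multiplies by 1000, which no longer fits in int64.
nanos = micros * 1000
fits_in_int64_ns = nanos <= INT64_MAX

print(micros)            # 81953424000000000
print(fits_in_int64_ns)  # False
```

The microsecond value fits in an int64 comfortably, but the nanosecond value does not, so a nanosecond-resolution column cannot represent the year 4567.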

tswast commented Sep 3, 2020

The test_large_timestamp error was fixed in google-cloud-bigquery. The remaining failures are tracked in #2353.

tswast closed this as completed Sep 3, 2020