deps: expand pyarrow dependencies to include version 2 #368

Merged: 3 commits merged on Nov 10, 2020

Conversation

tswast
Contributor

@tswast tswast commented Nov 4, 2020

Pyarrow 2.0 includes several bug fixes. The wire format remains the same, so it continues to be compatible with the BigQuery Storage API.
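
As a rough illustration of what widening the dependency range means, here is a minimal sketch of a version check that accepts both the 1.x and 2.x series. The bounds (`1.0.0` inclusive, `3.0.0` exclusive) and the helper names are assumptions made for illustration; the exact specifier changed in this PR is not shown in the conversation.

```python
# Hypothetical bounds: accept pyarrow 1.x and 2.x, reject 3.x and later.
# These values are assumed for illustration, not copied from the PR.
PYARROW_MIN = (1, 0, 0)
PYARROW_MAX_EXCLUSIVE = (3, 0, 0)


def parse_version(text):
    """Turn a plain 'X.Y.Z' string into a 3-tuple of ints, padding with zeros."""
    parts = text.split(".")[:3]
    parts += ["0"] * (3 - len(parts))
    return tuple(int(part) for part in parts)


def is_supported(version_text):
    """Return True if the version falls inside the assumed supported range."""
    return PYARROW_MIN <= parse_version(version_text) < PYARROW_MAX_EXCLUSIVE
```

In a real `setup.py` the same idea would normally be expressed as a version specifier string on the dependency rather than as a runtime check.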

@tswast tswast requested review from a team and stephaniewang526 November 4, 2020 22:23
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Nov 4, 2020
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label Nov 4, 2020
@tswast
Contributor Author

tswast commented Nov 4, 2020

The test failure is relevant:

______ TestRowIterator.test_to_dataframe_timestamp_out_of_pyarrow_bounds _______

self = <tests.unit.test_table.TestRowIterator testMethod=test_to_dataframe_timestamp_out_of_pyarrow_bounds>

    @pytest.mark.xfail(
        six.PY2,
        reason=(
            "Requires pyarrow>=1.0 to work, but the latter is not compatible "
            "with Python 2 anymore."
        ),
    )
    @unittest.skipIf(pandas is None, "Requires `pandas`")
    @unittest.skipIf(pyarrow is None, "Requires `pyarrow`")
    def test_to_dataframe_timestamp_out_of_pyarrow_bounds(self):
        from google.cloud.bigquery.schema import SchemaField
    
        schema = [SchemaField("some_timestamp", "TIMESTAMP")]
        rows = [
            {"f": [{"v": "81953424000.0"}]},  # 4567-01-01 00:00:00  UTC
            {"f": [{"v": "253402214400.0"}]},  # 9999-12-31 00:00:00  UTC
        ]
        path = "/foo"
        api_request = mock.Mock(return_value={"rows": rows})
        row_iterator = self._make_one(_mock_client(), api_request, path, schema)
    
        df = row_iterator.to_dataframe(create_bqstorage_client=False)
    
        self.assertIsInstance(df, pandas.DataFrame)
        self.assertEqual(len(df), 2)  # verify the number of rows
        self.assertEqual(list(df.columns), ["some_timestamp"])
>       self.assertEqual(
            list(df["some_timestamp"]),
            [dt.datetime(4567, 1, 1), dt.datetime(9999, 12, 31)],
        )
E       AssertionError: Lists differ: [date[25 chars] 0, 0, tzinfo=<UTC>), datetime.datetime(9999, [23 chars]TC>)] != [date[25 chars] 0, 0), datetime.datetime(9999, 12, 31, 0, 0)]
E       
E       First differing element 0:
E       datetime.datetime(4567, 1, 1, 0, 0, tzinfo=<UTC>)
E       datetime.datetime(4567, 1, 1, 0, 0)
E       
E       + [datetime.datetime(4567, 1, 1, 0, 0), datetime.datetime(9999, 12, 31, 0, 0)]
E       - [datetime.datetime(4567, 1, 1, 0, 0, tzinfo=<UTC>),
E       -  datetime.datetime(9999, 12, 31, 0, 0, tzinfo=<UTC>)]

tests/unit/test_table.py:2345: AssertionError

@tswast
Contributor Author

tswast commented Nov 5, 2020

Fixed the test failure in the latest commit. Even though the behavior is slightly different between pyarrow 1.0 and 2.0 as seen in the commit, I think it's worth keeping a wider range due to Arrow's use as a core library.
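
The difference shows up in the failure above: the values returned after the upgrade carry `tzinfo=<UTC>`, while the test expected naive datetimes. The fixing commit is not shown in this thread, so the following is only a minimal sketch, using the standard library, of how an assertion could tolerate both behaviors. `to_naive_utc` is a hypothetical helper, not part of the library or its tests.

```python
import datetime as dt


def to_naive_utc(value):
    """Hypothetical helper: express an aware datetime in UTC and drop tzinfo.

    Naive datetimes are returned unchanged and are assumed to already be UTC.
    """
    if value.tzinfo is not None:
        value = value.astimezone(dt.timezone.utc).replace(tzinfo=None)
    return value


# Values shaped like the two results seen in the traceback.
aware = [
    dt.datetime(4567, 1, 1, tzinfo=dt.timezone.utc),
    dt.datetime(9999, 12, 31, tzinfo=dt.timezone.utc),
]
naive = [dt.datetime(4567, 1, 1), dt.datetime(9999, 12, 31)]

# An aware datetime never compares equal to a naive one, which is why
# the original assertion failed.
lists_differ = aware != naive

# After normalization both results match the same expected values.
normalized_aware = [to_naive_utc(value) for value in aware]
normalized_naive = [to_naive_utc(value) for value in naive]
```

Normalizing before comparing keeps one assertion valid across a range of pyarrow versions, at the cost of no longer checking whether the returned values are timezone-aware.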

@tswast tswast added the automerge Merge the pull request once unit tests and other checks pass. label Nov 10, 2020
@gcf-merge-on-green gcf-merge-on-green bot merged commit cd9febd into googleapis:master Nov 10, 2020
@gcf-merge-on-green gcf-merge-on-green bot removed the automerge Merge the pull request once unit tests and other checks pass. label Nov 10, 2020
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. cla: yes This human has signed the Contributor License Agreement.

2 participants