tests/system/test_pandas.py::test_insert_rows_from_dataframe is flaky #733

Closed
jimfulton opened this issue Jun 30, 2021 · 4 comments · Fixed by #832
Labels: api: bigquery, testing, type: process

Comments

@jimfulton
Contributor

In CI, this test has failed at least twice in the last day.

>       assert len(row_tuples) == len(expected)
E       AssertionError: assert 0 == 6
E        +  where 0 = len([])
E        +  and   6 = len([(1.11, True, 'my string', 10), (2.22, False, 'another string', 20), (3.33, False, 'another string', 30), (4.44, True, 'another string', 40), (5.55, False, 'another string', 50), (6.66, True, None, 60)])

tests/system/test_pandas.py:696: AssertionError

Perhaps this is due to eventual consistency and we need to poll until we get 6 results.

product-auto-label bot added the "api: bigquery" label on Jun 30, 2021
@plamut
Contributor

plamut commented Jun 30, 2021

Interestingly, I haven't observed this in the last 2+ years. As per docs, streamed data should be available within a few seconds, but something might have changed so that even this is sometimes too late.

@shollyman Is there a reasonable upper bound on those "few seconds"? I'd like to know how long it makes sense to keep polling in the test.

@plamut
Contributor

plamut commented Jul 1, 2021

Also linking a comment by @tseaver for reference.

tswast added the "type: process" label on Jul 1, 2021
@tswast
Contributor

tswast commented Jul 1, 2021

Queries should be reading from the streaming buffer, so I'm a bit surprised this started flaking.

Edit: That's why we run a query here instead of using list_rows

    rows = list(
        bigquery_client.query(
            "SELECT * FROM `{}.{}.{}`".format(
                table.project, table.dataset_id, table.table_id
            )
        )
    )

@tswast
Contributor

tswast commented Jul 28, 2021

I'm seeing the "within a few seconds" language now. I think that's a bit different from how it used to be. Polling until we get 6 results (max a few times) seems the right option.
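
For illustration, the bounded polling suggested above could look roughly like the sketch below. The helper name `poll_for_rows` and its parameters (`max_attempts`, `delay_seconds`, `sleep`) are assumptions made for this example, not names from the test suite, and the actual fix in #832 may well be structured differently.

```python
import time


def poll_for_rows(fetch_rows, expected_count, max_attempts=5,
                  delay_seconds=1.0, sleep=time.sleep):
    """Call fetch_rows() until it yields at least expected_count rows.

    Hypothetical helper: the rows from the last attempt are always
    returned, so the caller's own assertion still reports a useful
    message if the streamed data never becomes visible.
    """
    rows = []
    for attempt in range(max_attempts):
        rows = list(fetch_rows())
        if len(rows) >= expected_count:
            break
        # Do not sleep after the final attempt; just give up.
        if attempt < max_attempts - 1:
            sleep(delay_seconds)
    return rows
```

In the test, `fetch_rows` could be a small callable that reruns the `SELECT *` query quoted earlier in this thread, with `expected_count=len(expected)`. Keeping `max_attempts` small matches the "max a few times" suggestion and stops a real regression from hanging CI.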
