Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: issue a warning if buggy pyarrow is detected #787

Merged
merged 1 commit into from Jul 21, 2021

Conversation

plamut
Copy link
Contributor

@plamut plamut commented Jul 21, 2021

Fixes #781.

Some pyarrow versions can cause issue when loading data from dataframe. This commit detects if such pyarrow version is installed and warns the user.

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Some pyarrow versions can cause issue when loading data from dataframe.
This commit detects if such pyarrow version is installed and warns the
user.
@plamut plamut requested review from tswast and a team July 21, 2021 16:32
@plamut plamut requested a review from a team as a code owner July 21, 2021 16:32
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Jul 21, 2021
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label Jul 21, 2021
@plamut
Copy link
Contributor Author

plamut commented Jul 21, 2021

@tswast Alternatively we could ban pyarrow==2.0.0, since 1.0.x seems to work just fine, although that might be more disturbing. Besides, the bug seems to occur in more specific cases, such as having a large enough DataFrame with arrays of records, but seems to work fine otherwise (at least we haven't heard about anyone complaining to date).

Copy link
Contributor

@tswast tswast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach seems good to me. I agree that it must be a pretty narrow set of conditions where this matters. If someone is stuck on pyarrow 2.0, hopefully this will at least make them aware.

@tswast tswast merged commit e403721 into googleapis:master Jul 21, 2021
import tempfile
from typing import Any, BinaryIO, Dict, Iterable, Optional, Sequence, Tuple, Union
import uuid
import warnings

try:
import pyarrow

_PYARROW_VERSION = packaging.version.parse(pyarrow.__version__)
except ImportError: # pragma: NO COVER
pyarrow = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially leaves _PYARROW_VERSION undefined. Maybe:

Suggested change
pyarrow = None
pyarrow = None
_PYARROW_VERSION = None

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's only used in the path where pyarrow is definitely available (with a prior check), thus left it out. Although that implicit fact can change...

On the other hand, we are already making pyarrow required, thus it probably shouldn't be an issue in the transitional period, but a valid remark nevertheless.

@plamut plamut deleted the iss-781 branch July 21, 2021 17:21
ttobisawa pushed a commit to ttobisawa/python-bigquery that referenced this pull request Aug 31, 2021
* changes without context

        autosynth cannot find the source of changes triggered by earlier changes in this
        repository, or by version upgrades to tools such as linters.

* chore: add config / docs for 'pre-commit' support

Source-Author: Tres Seaver <tseaver@palladion.com>
Source-Date: Tue Dec 1 16:01:20 2020 -0500
Source-Repo: googleapis/synthtool
Source-Sha: 32af6da519a6b042e3da62008e2a75e991efb6b4
Source-Link: googleapis/synthtool@32af6da

* chore(deps): update precommit hook pre-commit/pre-commit-hooks to v3.3.0

Source-Author: WhiteSource Renovate <bot@renovateapp.com>
Source-Date: Wed Dec 2 17:18:24 2020 +0100
Source-Repo: googleapis/synthtool
Source-Sha: 69629b64b83c6421d616be2b8e11795738ec8a6c
Source-Link: googleapis/synthtool@69629b6

* chore: update noxfile.py.j2

* Update noxfile.py.j2

add changes from @glasnt to the template template to ensure that enforcing type hinting doesn't fail for repos with the sample noxfile (aka all samples repos)
See https://github.com/GoogleCloudPlatform/python-docs-samples/pull/4869/files for context

* fix typo

Source-Author: Leah E. Cole <6719667+leahecole@users.noreply.github.com>
Source-Date: Thu Dec 3 13:44:30 2020 -0800
Source-Repo: googleapis/synthtool
Source-Sha: 18c5dbdb4ac8cf75d4d8174e7b4558f48e76f8a1
Source-Link: googleapis/synthtool@18c5dbd

* chore(deps): update precommit hook pre-commit/pre-commit-hooks to v3.4.0

Co-authored-by: Tres Seaver <tseaver@palladion.com>

Source-Author: WhiteSource Renovate <bot@renovateapp.com>
Source-Date: Wed Dec 16 18:13:24 2020 +0100
Source-Repo: googleapis/synthtool
Source-Sha: aa255b15d52b6d8950cca48cfdf58f7d27a60c8a
Source-Link: googleapis/synthtool@aa255b1

* docs(python): document adding Python 3.9 support, dropping 3.5 support

Closes googleapis#787

Source-Author: Tres Seaver <tseaver@palladion.com>
Source-Date: Thu Dec 17 16:08:02 2020 -0500
Source-Repo: googleapis/synthtool
Source-Sha: b670a77a454f415d247907908e8ee7943e06d718
Source-Link: googleapis/synthtool@b670a77

* chore: exclude `.nox` directories from linting

The samples tests create `.nox` directories
with all dependencies installed. These directories
should be excluded from linting.

I've tested this change locally, and it significantly
speeds up linting on my machine.

Source-Author: Tim Swast <swast@google.com>
Source-Date: Tue Dec 22 13:04:04 2020 -0600
Source-Repo: googleapis/synthtool
Source-Sha: 373861061648b5fe5e0ac4f8a38b32d639ee93e4
Source-Link: googleapis/synthtool@3738610

* chore(python): fix column sizing issue in docs

Source-Author: Bu Sun Kim <8822365+busunkim96@users.noreply.github.com>
Source-Date: Thu Jan 7 11:58:32 2021 -0700
Source-Repo: googleapis/synthtool
Source-Sha: f15b57ccfd71106c2299e9b89835fe6e55015662
Source-Link: googleapis/synthtool@f15b57c

* chore(python): use 'http' in LICENSE

Co-authored-by: Tim Swast <swast@google.com>

Source-Author: Bu Sun Kim <8822365+busunkim96@users.noreply.github.com>
Source-Date: Thu Jan 7 13:05:12 2021 -0700
Source-Repo: googleapis/synthtool
Source-Sha: 41a4e56982620d3edcf110d76f4fcdfdec471ac8
Source-Link: googleapis/synthtool@41a4e56

* chore(python): skip docfx in main presubmit

* chore(python): skip docfx in main presubmit

* fix: properly template the repo name

Source-Author: Bu Sun Kim <8822365+busunkim96@users.noreply.github.com>
Source-Date: Fri Jan 8 10:32:13 2021 -0700
Source-Repo: googleapis/synthtool
Source-Sha: fb53b6fb373b7c3edf4e55f3e8036bc6d73fa483
Source-Link: googleapis/synthtool@fb53b6f

* chore: add missing quotation mark

Source-Author: Bu Sun Kim <8822365+busunkim96@users.noreply.github.com>
Source-Date: Mon Jan 11 09:43:06 2021 -0700
Source-Repo: googleapis/synthtool
Source-Sha: 16ec872dd898d7de6e1822badfac32484b5d9031
Source-Link: googleapis/synthtool@16ec872
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

load_table_from_dataframe produces incorrect results when used in list of dict
3 participants