Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deps: expand supported pyarrow versions to v4 #643

Merged
merged 3 commits into from May 5, 2021

Conversation

plamut
Copy link
Contributor

@plamut plamut commented May 3, 2021

Closes #635

PR checklist:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

@plamut plamut requested review from tswast and a team May 3, 2021 09:13
@plamut plamut requested a review from a team as a code owner May 3, 2021 09:13
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label May 3, 2021
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label May 3, 2021
Copy link
Contributor

@jimfulton jimfulton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did you determine that PyArrow 4 was acceptable? When I run nox on your branch, it uses PyArrow 3. This is a bit of a process question, which I'll be faced with sqlalchemy soon.

@plamut
Copy link
Contributor Author

plamut commented May 3, 2021

Samples, which cover typical use cases, now run with v4 and the tests pass.

Also, by default pip installs the latest compatible versions, which are then used in the tests, with the exception of Python 3.6 where we constrain the versions to the oldest one still supported.

@jimfulton
Copy link
Contributor

Are you sure? :)

How can you tell?

When I run full nox locally pyarrow 3 is installed (except for Python 3.6 which uses 1).

What do you get when you run:

find . -name pyarrow-\*.dist-info

?

@plamut
Copy link
Contributor Author

plamut commented May 3, 2021

If pyarrow is pre-installed, pip does not upgrade it for me, because the dependency is already satisfied, but, but in a fresh virtualenv pyarrow==4.0.0 gets installed:

$ pip install -e .[all]

The output of find . -name pyarrow-\*.dist-info:

./env/3.9/lib/python3.9/site-packages/pyarrow-3.0.0.dist-info
./env/3.8/lib/python3.8/site-packages/pyarrow-4.0.0.dist-info
./env/3.7/lib/python3.7/site-packages/pyarrow-1.0.1.dist-info
./env/testpy/lib/python3.8/site-packages/pyarrow-4.0.0.dist-info   #  <--- this is the fresh virtualenv
./env/3.6/lib/python3.6/site-packages/pyarrow-3.0.0.dist-info
./env/3.5/lib/python3.5/site-packages/pyarrow-0.16.0.dist-info
./env/2.7/lib/python2.7/site-packages/pyarrow-0.16.0.dist-info

However, I did notice that nox actually installed 3.0.0 (no pre-existing nox cache), which is a surprise. Is this what you were referring to?

./.nox/unit-3-8/lib/python3.8/site-packages/pyarrow-3.0.0.dist-info

For some reason different versions are installed in a nox session than when creating a fresh environment manually, hmm...

Update:
For samples pyarrow==4.0.0 is installed. Huh?

$ pwd
/.../python-bigquery/samples/snippets
$ find . -name pyarrow-\*.dist-info
./.nox/py-3-8/lib/python3.8/site-packages/pyarrow-4.0.0.dist-info

@jimfulton
Copy link
Contributor

I removed my .nox directory and then ran nox and got pyarrow 3 (or 1 for Python 3.6).

./.nox/unit-3-7/lib/python3.7/site-packages/pyarrow-3.0.0.dist-info
./.nox/pytype/lib/python3.8/site-packages/pyarrow-3.0.0.dist-info
./.nox/system-3-8/lib/python3.8/site-packages/pyarrow-3.0.0.dist-info
./.nox/unit-3-8/lib/python3.8/site-packages/pyarrow-3.0.0.dist-info
./.nox/unit-3-9/lib/python3.9/site-packages/pyarrow-3.0.0.dist-info
./.nox/snippets-3-8/lib/python3.8/site-packages/pyarrow-3.0.0.dist-info
./.nox/docs/lib/python3.8/site-packages/pyarrow-3.0.0.dist-info
./.nox/unit-3-6/lib/python3.6/site-packages/pyarrow-1.0.0.dist-info

IDK how nox+pip picks versions. At some point, I thought someone said we somehow arranged to pick minimal versions, but I don't see anything like that in how pip is invoked and I've been generally perplexed by versions chosen.

Again, I'm not really trying to say anything is wrong. I'm just trying to understand our approach.

@plamut
Copy link
Contributor Author

plamut commented May 3, 2021

@jimfulton I noticed the same, got the version 3.0.0 installed. AFAIK pip always tries to install the most recent compatible version, and it does so for samples tests, but for some reason this assumption does not hold for the top level noxfile sessions...

Blocking until we get to the bottom of this.

@plamut plamut added the do not merge Indicates a pull request not ready for merge, due to either quality or timing. label May 3, 2021
@jimfulton
Copy link
Contributor

How are you running the samples tests? Is that with nox -s snippets?

I'm getting pyarrow 3 for snippets.

BTW, I agree wrt expected pip behavior.

@plamut
Copy link
Contributor Author

plamut commented May 3, 2021

I got version 4 when using the snippets' own noxfile:

$ cd <repo_root>/samples/snippets
$ python -m nox -s py-3.8

I think we could actually benefit to run pip freeze on Kokoro before the start of the each test session so that we know which versions the CI checks actually used (there's no such info in the logs right now).

@jimfulton
Copy link
Contributor

Me too. I got pyarrow 4 when running nox from the snippets directory.

setup.py Outdated Show resolved Hide resolved
@jimfulton
Copy link
Contributor

Maybe in cases like this, we should add a constraint. When I added pyarrow>=4.0.0 to the latest constraint file, the error became apparent. :)

@plamut plamut removed the do not merge Indicates a pull request not ready for merge, due to either quality or timing. label May 3, 2021
@plamut plamut requested a review from jimfulton May 3, 2021 15:48
@plamut
Copy link
Contributor Author

plamut commented May 3, 2021

Constraints to force the very latest versions to be used in tests?

Might work, although that can still bite us when, say, the renovate bot updates a version pin, but we forget about the constraint that restricts it to a less recent version. I feel that's actually too likely to happen...

@jimfulton
Copy link
Contributor

I was suggesting adding pyarrow>=4.0.0 to constraints-3.9.txt.

Maybe tomorrow I'll ask you to explain how renovate bot works. ATM, I'm not impressed. :)

@plamut
Copy link
Contributor Author

plamut commented May 5, 2021

@jimfulton Do you feel strongly about adding pyarrow>=4.0.0 to constraints-3.9.txt right now in this PR, or should we defer that until it is decided how to do that holistically?
(probably by re-configuring the renovate bot if I understood the yesterday's offline discussion correctly).

@jimfulton
Copy link
Contributor

I would add it to protect us. It doesn't harm anything. Remember we had an uncaught bug that this would have caught.

I plan to do the same when making sure we test the sqlalchemy dialect with 1.4.

@plamut
Copy link
Contributor Author

plamut commented May 5, 2021

Makes sense, will add it right now then.

@plamut plamut requested review from jimfulton and removed request for jimfulton May 5, 2021 13:02
@plamut plamut merged commit 9e1d386 into googleapis:master May 5, 2021
@plamut plamut deleted the iss-635 branch May 5, 2021 13:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

deps: expand extras to support pyarrow v4
2 participants