Packaging NumPy 2.0.0rc1 #311

Open
jakirkham opened this issue Mar 11, 2024 · 10 comments

Comments

@jakirkham
Member

jakirkham commented Mar 11, 2024

Recently NumPy tagged v2.0.0rc1 (v2.0.0b1 when this issue was first opened). This includes a number of changes documented in these release notes.

To gain familiarity with it and make it easier for developers, feedstock maintainers, etc. to test with it, I think we may benefit from packaging the beta here following CFEP 5.

Bringing this up here to discuss. Also opening for discussion of any other next steps

References:

Edit: Updated to 2.0.0rc1 now that it is tagged and packaged. Though this was originally about 2.0.0b1

Edit 2: Link ecosystem NumPy 2 compatibility tracking issue

@rgommers
Contributor

(disclaimer: I'm not that familiar with how pre-releases are done in conda-forge)

My current impression:

  • It seems useful to package 2.0.0b1,
  • It also seems useful for projects which depend on NumPy and use the NumPy C API in nontrivial ways to have draft PRs opened to build against 2.0.0b1,
    • a little less clear if such PRs should be merged; the ABI is unlikely to change for 2.0.0rc1, but it may
    • having a larger set of builds against rc1 would make sense, it'd be quite useful to smoke out problems

Questions:

  1. The pin_compatible mechanism will need adjusting probably. What would be preferred, having:
    • (a) builds against 1.2x with runtime dependency >=1.2x,<2 and a second set of builds against numpy 2.0 with runtime dependencies >=2.0, or
    • (b) only build against 2.0, with runtime dependencies >=1.2x?
  2. Would it make sense to have a migrator that gets the whole stack ready in the period between 2.0.0rc1 and 2.0.0 final?

1(b) seems nicer to me (avoids building everything twice), but I'm not sure what has to be adjusted to make that happen.
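
To make the difference between the two options concrete, here is a minimal sketch of the runtime constraints each one would produce. The function name `runtime_constraints` is hypothetical and not part of conda-build or conda-forge tooling, and `1.2x` is kept as the same placeholder used above rather than a real version.

```python
def runtime_constraints(option: str, oldest: str = "1.2x") -> dict[str, str]:
    """Map each NumPy version built against to the runtime constraint it implies.

    `oldest` is the placeholder from the discussion ("1.2x"), not a real version.
    This only illustrates the two options; it is not conda-forge tooling.
    """
    if option == "a":
        # Two sets of builds: one per NumPy major version.
        return {oldest: f">={oldest},<2", "2.0": ">=2.0"}
    if option == "b":
        # One set of builds against 2.0 that also runs on older NumPy.
        return {"2.0": f">={oldest}"}
    raise ValueError(f"unknown option: {option!r}")


for opt in ("a", "b"):
    print(opt, runtime_constraints(opt))
```

Option (a) yields two entries, meaning two builds per package, while option (b) yields one, which is the saving in build resources mentioned above.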

@jakirkham
Member Author

Thanks Ralf! 🙏


We can certainly publish the b1 packages and we can open test PRs on other feedstocks using b1. The test PRs could...

  • Cache build artifacts (including packages) on CI
    • Primarily for testing NumPy packages themselves
    • Useful if we are concerned about API breakages (no need to worry about broken packages)
    • Smaller testing pool (maintainers of associated feedstocks, developers on those libraries)
  • Publish the downstream packages to a RC label
    • Considers broader ecosystem testing with NumPy
    • Easy to construct complex environments
    • Useful if we want broader consumption of these packages (average conda-forge user)

My guess is for b1 having open test PRs and not merging is fine. Whereas for rc1 we may be more interested in merging. Really this is a question of what kind (and how much) testing we want to do and what level of engagement we are looking for


The 1(a) approach is easy to roll out with our current setup, though it is more resource intensive over the long term.

The 1(b) approach may take a bit more work to roll out. That said, it is lighter on resource consumption and more flexible for users

There is a bit of discussion about this particularly in issue ( conda-forge/conda-forge-pinning-feedstock#4816 ). One concern is other build tools or libraries could alter settings we provide ( conda-forge/conda-forge-pinning-feedstock#4816 (comment) ). This is why having a way to audit binaries for NumPy compatibility ( numpy/numpy#25948 ) would help us validate intended NumPy compatibility (or fail builds that are out of alignment)

Agree 1(b) would be nicer. Just need to make sure that we are handling the aforementioned issue correctly

A migrator would mainly be needed for 1(a). It may or may not be needed for 1(b) depending on how we handle other steps

@h-vetinari
Member

We should IMO clearly do 1(b). Double numpy builds until we're on 2.0+ would blow out our CI matrices.

The validation that no package scripts override NPY_TARGET_VERSION in contradiction with our expectation (and metadata) would be great to have, but can follow when it's ready (in the meantime, it would just be a regular packaging bug if NPY_TARGET_VERSION gets wrongly set and mismatches our infra; we'd be able to easily fix it per feedstock).

I've opened #312 to build 2.0.0b1 for the numpy_dev label.

@jakirkham
Member Author

That would be my preference as well. Though it goes beyond even CI resources and provides value to developers and end users as they can more easily test and switch between NumPy 1.x & 2.x

AIUI (and Ralf or Sebastian can correct me) NPY_FEATURE_VERSION takes the value of NPY_TARGET_VERSION when the latter is set. Binaries built with NumPy's C-API would then bake in NPY_FEATURE_VERSION thanks to PR ( numpy/numpy#25948 ). So we should be able to check the strings of the final binaries to see whether they are built with the NPY_TARGET_VERSION we set (or error otherwise)
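
A rough sketch of what such a string check could look like, assuming the compiled target shows up in the binary as printable text behind a known marker. The marker `NPY_FEATURE_VERSION=` and all helper names here are hypothetical; the actual string embedded via numpy/numpy#25948 may look different.

```python
import re

# Hypothetical marker; the real string embedded by NumPy may differ.
MARKER = b"NPY_FEATURE_VERSION="


def printable_strings(data: bytes, min_len: int = 4) -> list[bytes]:
    """Return runs of printable ASCII, similar to the `strings` utility."""
    return re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)


def embedded_versions(data: bytes, marker: bytes = MARKER) -> list[str]:
    """Collect whatever follows the marker in each printable string."""
    found = []
    for text in printable_strings(data):
        idx = text.find(marker)
        if idx != -1:
            found.append(text[idx + len(marker):].decode("ascii"))
    return found


def matches_target(data: bytes, expected: str) -> bool:
    """True only if a marker was found and every occurrence equals `expected`."""
    versions = embedded_versions(data)
    return bool(versions) and all(v == expected for v in versions)


# Synthetic bytes standing in for a compiled extension module.
blob = b"\x00\x01ELF\x00junk\x00NPY_FEATURE_VERSION=0x12\x00more"
print(embedded_versions(blob))
```

In a real check the bytes would come from reading each shipped extension module, and a mismatch would fail the build rather than return `False`.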

@h-vetinari
Member

> So we should be able to check the strings of the final binaries to see whether they are built with the NPY_TARGET_VERSION we set (or error otherwise)

Yeah, I think the pieces for this are mostly in place, except for the part where that test shouldn't have to be written & rewritten by every package compiling against numpy. That's the part that'll probably take a bit of figuring out.

@jakirkham
Member Author

jakirkham commented Mar 19, 2024

Agreed

Some possible general test locations (please feel free to propose more):

  1. Bake into conda-build's own package (perhaps using logic similar to inspect_linkages)
  2. Include some new test in conda-smithy feedstock templates
  3. Add logic to feedstock output validation
  4. Handle in package validation
  5. ...?

@seberg

seberg commented Mar 19, 2024

In theory it might be possible to have a setup that requires no explicit specification. If you have:

  • a numpy build requirement.
  • The lowest NumPy runtime version the package could be installed with.

Then you could scan the shipped .so files to see if any file is in violation. But I don't know how much you want numpy-specific hooks.
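
The comparison step of that idea might look roughly like the following. It assumes the target version of each shipped file has already been extracted somehow; the function names and the example versions are made up for illustration.

```python
def parse_version(text: str) -> tuple[int, int]:
    """Parse 'major.minor[.patch]' into a comparable (major, minor) tuple."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def find_violations(targets: dict[str, str], lowest_runtime: str) -> list[str]:
    """Return files compiled to target a newer NumPy than the runtime floor.

    `targets` maps a shipped file name to the NumPy version it was compiled
    to target; `lowest_runtime` is the lowest NumPy version the package
    metadata allows at runtime.
    """
    floor = parse_version(lowest_runtime)
    return sorted(
        name for name, version in targets.items()
        if parse_version(version) > floor
    )


# Illustrative values only; not taken from any real package.
shipped = {"_core.so": "1.19", "_fast.so": "2.0"}
print(find_violations(shipped, lowest_runtime="1.22"))
```

A file is flagged when it targets a NumPy newer than the lowest runtime version the package claims to support, since it could then be installed next to a NumPy too old for it.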

In practice, I still think that exceedingly few packages should bother to use this mechanism (and thus get the long default).
So NumPy runtime version limits will in practice not be due to C-API compilation options, and in that sense I am not sure how much you need to worry about it (although any test is nice!).

@rgommers
Contributor

> The validation that no package scripts override NPY_TARGET_VERSION in contradiction with our expectation (and metadata) would be great to have, but can follow when it's ready (in the meantime, it would just be a regular packaging bug if NPY_TARGET_VERSION gets wrongly set and mismatches our infra; we'd be able to easily fix it per feedstock).

I completely agree. This is mostly an academic worry at this point, and is going to apply to very few packages in the future.

The key things are (a) to start getting the build and runtime dependencies right as soon as possible after 2.0.0rc1 is out, and (b) add a permanent metadata patch for all built packages already out there which are not ABI-compatible with the NumPy 2.x ABI.

@jakirkham
Member Author

jakirkham commented Apr 1, 2024

With PR ( #312 ) now merged, users can install with conda install conda-forge/label/numpy_dev::numpy=2.0.0rc1

@h-vetinari
Member

Note that the setup here has changed a bit in order to allow the migration to proceed despite some constraints (e.g. we still have python 3.8, we need to keep strict channel priority, etc.). So as of #314, the packages here are pushed to the main conda-forge channel, but depend on _numpy_rc, which can only be installed from conda-forge/label/numpy_rc.

So from the POV of installation, not much has changed, except that one needs to provide the channel explicitly:

conda install -c conda-forge/label/numpy_rc numpy=2.0.0rc1

Also note that the respective label is now called "numpy_rc", as I felt that's more appropriate here than "numpy_dev"
