
Numpy 2.0 will "clean up" the Python API (and change the C API) #1864

Open
hamogu opened this issue Aug 29, 2023 · 9 comments

@hamogu
Contributor

hamogu commented Aug 29, 2023

Numpy plans a release candidate for numpy 2.0 late in Dec 2023 with significant clean-up of the Python API and changes to the C API:
numpy/numpy#24300

There is almost certainly something lurking in Sherpa that will break. So, we should add "numpy < 2.0" to our requirements for this year and then start testing against numpy 2.0 when it becomes available.
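
A hedged sketch of what such a pin could look like in a setuptools-based setup.py (hypothetical; Sherpa's actual packaging configuration may differ):

# Hypothetical packaging sketch: keep NumPy below 2.0 until Sherpa has been
# tested against the 2.0 release candidate.
from setuptools import setup

setup(
    name="sherpa",
    install_requires=["numpy<2.0"],
)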

@DougBurke
Contributor

The following builds against numpy 1 but runs with numpy 2 (development) and shows some immediate problem areas:

% pip install -U --pre --only-binary :all: -i https://pypi.anaconda.org/scientific-python-nightly-wheels/simple numpy
% pip install -e .
Obtaining file:///lagado.real/sherpa/sherpa-work
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: numpy in /lagado2.real/local/anaconda/envs/sherpa-work/lib/python3.10/site-packages (from sherpa==4.15.1+200.g9eef8f9c) (2.0.0.dev0)
Installing collected packages: sherpa
  Running setup.py develop for sherpa
Successfully installed sherpa-4.15.1+200.g9eef8f9c
% pytest
=================================================== test session starts ===================================================
platform linux -- Python 3.10.4, pytest-7.1.3, pluggy-1.0.0
rootdir: /lagado.real/sherpa/sherpa-work, configfile: pytest.ini
plugins: xvfb-2.0.0, doctestplus-0.12.0, forked-1.4.0, xdist-2.5.0
collected 0 items / 1 error                                                                                               

========================================================= ERRORS ==========================================================
______________________________________________ ERROR collecting test session ______________________________________________
/lagado2.real/local/anaconda/envs/sherpa-work/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1050: in _gcd_import
    ???
<frozen importlib._bootstrap>:1027: in _find_and_load
    ???
<frozen importlib._bootstrap>:1006: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:688: in _load_unlocked
    ???
/lagado2.real/local/anaconda/envs/sherpa-work/lib/python3.10/site-packages/_pytest/assertion/rewrite.py:168: in exec_module
    exec(co, module.__dict__)
docs/conftest.py:29: in <module>
    from sherpa.utils.testing import get_datadir
sherpa/utils/__init__.py:95: in <module>
    SherpaFloat = numpy.float_
/lagado2.real/local/anaconda/envs/sherpa-work/lib/python3.10/site-packages/numpy/__init__.py:356: in __getattr__
    raise AttributeError(
E   AttributeError: `np.float_` was removed in the NumPy 2.0 release. Use `np.float64` instead.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
==================================================== 1 error in 0.62s =====================================================
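
The failing line itself has an obvious fix, since np.float64 exists in both NumPy 1.x and 2.x (a minimal sketch, assuming Sherpa only needs a 64-bit float scalar type here):

import numpy

# np.float_ was an alias for np.float64 and was removed in NumPy 2.0,
# so the explicit name works with both 1.x and 2.x.
SherpaFloat = numpy.float64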

@DougBurke
Contributor

In doing this we will also need to worry about backend issues (I've just seen some in astropy.io.fits), so it'll be a slow process.

@DougBurke
Contributor

@hamogu - here's an interesting one (hopefully astropy will hit this and come up with a plan): the string output for NumPy scalar values has changed, so how can we doctest this when the expected output depends on the numpy version:

% pytest sherpa/data.py 
=================================================== test session starts ===================================================
platform linux -- Python 3.10.4, pytest-7.1.3, pluggy-1.0.0
rootdir: /lagado.real/sherpa/sherpa-work, configfile: pytest.ini
plugins: xvfb-2.0.0, doctestplus-0.12.0, forked-1.4.0, xdist-2.5.0
collected 10 items                                                                                                        

sherpa/data.py ...F.F....                                                                                           [100%]

======================================================== FAILURES =========================================================
___________________________________________ [doctest] sherpa.data.Data1D.notice ___________________________________________
1776         the range of noticed data (when `ignore=False`).
1777 
1778         Examples
1779         --------
1780 
1781         >>> import numpy as np
1782         >>> x = np.arange(0.4, 2.6, 0.2)
1783         >>> y = np.ones_like(x)
1784         >>> d = Data1D('example', x, y)
1785         >>> d.x[0], d.x[-1]
Expected:
    (0.4, 2.4000000000000004)
Got:
    (np.float64(0.4), np.float64(2.400000000000001))

/lagado.real/sherpa/sherpa-work/sherpa/data.py:1785: DocTestFailure
_________________________________________ [doctest] sherpa.data.Data1DInt.notice __________________________________________
1981 
1982         Examples
1983         --------
1984 
1985         >>> import numpy as np
1986         >>> edges = np.arange(0.4, 2.6, 0.2)
1987         >>> xlo, xhi = edges[:-1], edges[1:]
1988         >>> y = np.ones_like(xlo)
1989         >>> d = Data1DInt('example', xlo, xhi, y)
1990         >>> d.xlo[0], d.xhi[-1]
Expected:
    (0.4, 2.400000000000001)
Got:
    (np.float64(0.4), np.float64(2.400000000000001))

/lagado.real/sherpa/sherpa-work/sherpa/data.py:1990: DocTestFailure
=============================================== 2 failed, 8 passed in 0.09s ===============================================

and this is a np change:

>>> import numpy as np
>>> np.__version__
'2.0.0.dev0+git20230828.ef0fd39'
>>> np.asarray([1.2, 1.3])
array([1.2, 1.3])
>>> np.asarray([1.2, 1.3])[0]
np.float64(1.2)
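
One way to keep such doctest output independent of the NumPy version (a sketch of a possible workaround, not something Sherpa currently does) is to convert the scalar to a plain Python float before displaying it:

>>> import numpy as np
>>> x = np.arange(0.4, 2.6, 0.2)
>>> float(x[0])  # plain Python float repr is the same on NumPy 1.x and 2.x
0.4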

@hamogu
Contributor Author

hamogu commented Aug 30, 2023

I suggest we start worrying about all of those when the numpy 2.0 RC is out.
EDIT: I had said "Jan 2024" when I typed this originally. That's wrong; the release is planned for Dec 2023.
Things might still change on the numpy side and there is not much we can do now. I think we should go with the requirement "numpy < 2.0" for the 4.16 release anyway, as we can't guarantee that we have really caught all cases until we have a release-ready version of numpy to test against.

Numpy plans to supply a conversion tool similar to 2to3 from the Python 2 to 3 conversion (though, they say, less sophisticated, so it won't catch all cases, but it will do most of the automatic, tedious conversion); so it's probably a waste of time to make changes now that a script could quickly do for us later. It's also not as big a change as Python 2->3 was. Numpy wants to make sure that all 2.0-compatible code also runs with 1.x; they are mostly dropping aliases (e.g. np.float_ vs np.float64).

The main reason to look a little closer at this now is if we make changes to how we call numpy in a way that impacts users now (e.g. #1735). In that case, we should make sure that we design it in such a way that it's numpy 2.x compatible. However, for #1735 specifically, I did not see anything about dropping the numpy legacy API for random numbers, so it's not an issue there, but we should keep that in mind for other PRs.

For the specific case of doctests that @DougBurke raised above, one simple solution would be to run the doctests only with one version of numpy (either 1.x or 2.0) and update them when we change that numpy version. In my mind, doctests don't have to be run with every version in our test matrix. We want to make sure they work, but we don't want them to break over details of the print output, precisely because that output can depend sensitively on the version of Python and numpy.
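
A rough sketch of how that gating could look in a conftest.py (hypothetical; the names and mechanism are mine, not Sherpa's actual setup):

# conftest.py sketch: skip doctests unless the installed NumPy matches the
# major version that the expected outputs were written for.
import numpy
import pytest

DOCTEST_NUMPY_MAJOR = 1  # bump to 2 once the doctest outputs are updated

def pytest_collection_modifyitems(config, items):
    if int(numpy.__version__.split(".")[0]) == DOCTEST_NUMPY_MAJOR:
        return
    skip = pytest.mark.skip(reason="doctest outputs assume a different NumPy major version")
    for item in items:
        if type(item).__name__ == "DoctestItem":
            item.add_marker(skip)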
A few weeks ago, astropy moved doctestplus upstream to Scientific Python, and one thing that was promised was that one of the numpy developers would work on it and provide an "auto-fix-outputs" flag that automatically updates all outputs that are part of doctests; you would run that when you update e.g. the numpy version used to run the doctests, so you know that any changes are not from your code but just from the dependencies. That's not available yet, but might well be in January. So again, I just would not worry about it yet.

@hamogu
Contributor Author

hamogu commented Aug 30, 2023

That's not to say that I'm opposed to relatively small PRs like #1876 that @DougBurke put in while I was typing the above; anything that we can easily tick off right now increases the chances that (most of) Sherpa might just work with numpy 2.0. But I don't think it's worth spending excessive amounts of time to fix issues like string output in doctests that won't directly impact users' code.

@DougBurke
Contributor

One issue with the "only run doctests with a fixed version of X, Y, and Z" idea is that the tests become useless on my laptop, where I'm running with different versions. However, that's something to worry about another day.

@DougBurke
Contributor

@hamogu - from numpy/numpy#24300 they say

The tentative release date for NumPy 2.0 is end of December 2023.

You mentioned above that the RC wouldn't be out until January 2024, which is a rather less-aggressive schedule.

@hamogu
Contributor Author

hamogu commented Aug 31, 2023

I was wrong; I don't know how January 2024 popped into my head. I'll edit the post above to clarify that I'm wrong.

@hamogu
Contributor Author

hamogu commented Aug 31, 2023

One issue with the "only run doctests with a fixed version of X, Y, and Z" idea is that the tests become useless on my laptop, where I'm running with different versions. However, that's something to worry about another day.

We will run them on CI with whatever version we choose. On your laptop, you have a 50% chance of being on either <2.0 or >=2.0, and CI will check if something goes wrong. Except for some string output, numpy promises that 2.0-compatible code also runs with older versions, so what I'll do is develop predominantly with 2.0 once it's out. I'm hoping that we can get the doctests onto 2.0 relatively soon after. But that's just my plan - your mileage might vary.
