WIP: support NumPy 2.0 : NEP 52 (Python API cleanup) #8314

Open
wants to merge 62 commits into
base: main

ev-br
Contributor

@ev-br ev-br commented Apr 30, 2024

Work towards #8306.

The current checklist is

  • make sure cupy imports with numpy==2.0.0rc1
  • make sure pytest tests/cupy passes the collection stage
  • make sure pytest tests/cupyx passes the collection stage
  • make main cupy and numpy namespaces match
  • make ndarray methods match
  • function/method signatures:
    • keyword-only parameters
    • device arguments to various functions (needs decision on the behavior)
  • match submodule namespaces:
    • numpy.linalg
    • numpy.core
    • numpy._lib
    • numpy.exceptions
    • numpy.strings
    • numpy.dtypes is out of scope for this PR.

@ev-br ev-br force-pushed the np2.0 branch 3 times, most recently from 14d3b3a to 33c9b22 Compare May 1, 2024 08:15
@ev-br
Contributor Author

ev-br commented May 1, 2024

make main cupy and numpy namespaces match

Done, modulo

  • vecdot: is an enhancement
  • bitwise_count: is an enhancement
  • isdtype: did not touch dtypes yet
  • trapz/trapezoid --> numpy deprecated the former and renamed it to the latter. Can be done separately.

Details of the namespace comparison are under the fold.


In [1]: import cupy

In [2]: import numpy

In [3]: dnp = set([x for x in dir(numpy) if not x.startswith('_')])

In [4]: dcp = set(x for x in dir(cupy) if not x.startswith('_'))

In [5]: dnp.difference(dcp)
Out[5]: 
{'False_',
 'ScalarType',
 'True_',
 'apply_over_axes',
 'asmatrix',
 'bitwise_count',
 'block',
 'bmat',
 'bool',
 'busday_count',
 'busday_offset',
 'busdaycalendar',
 'bytes_',
 'char',
 'character',
 'clongdouble',
 'complex256',
 'core',
 'ctypeslib',
 'datetime64',
 'datetime_as_string',
 'datetime_data',
 'dtypes',
 'einsum_path',
 'emath',
 'errstate',
 'f2py',
 'flexible',
 'float128',
 'frompyfunc',
 'fromregex',
 'geomspace',
 'get_include',
 'getbufsize',
 'geterr',
 'geterrcall',
 'histogram_bin_edges',
 'info',
 'insert',
 'is_busday',
 'isdtype',
 'isnat',
 'little_endian',
 'long',
 'longdouble',
 'ma',
 'matrix',
 'matrix_transpose',
 'memmap',
 'nanpercentile',
 'nanquantile',
 'ndenumerate',
 'nditer',
 'nested_iters',
 'object_',
 'polyder',
 'polydiv',
 'polyint',
 'rec',
 'recarray',
 'record',
 'sctypeDict',
 'setbufsize',
 'seterr',
 'seterrcall',
 'show_runtime',
 'spacing',
 'str_',
 'strings',
 'test',
 'timedelta64',
 'trapezoid',
 'typecodes',
 'ulong',
 'vecdot',
 'void'}

In [6]: dcp.difference(dnp)
Out[6]: 
{'ElementwiseKernel',
 'RawKernel',
 'RawModule',
 'ReductionKernel',
 'asnumpy',
 'clear_memo',
 'cuda',
 'disable_experimental_feature_warning',
 'fromDlpack',
 'fuse',
 'get_array_module',
 'get_default_memory_pool',
 'get_default_pinned_memory_pool',
 'is_available',
 'memoize',
 'sparse'}
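
The same comparison can be wrapped in a small helper so it is easy to rerun for submodules as well. This is only a sketch: `public_names` and `namespace_diff` are hypothetical names that are not part of CuPy or NumPy, and the helper relies on nothing beyond `dir()`.

```python
def public_names(obj):
    """Return the set of attribute names of `obj` that do not start with '_'."""
    return {name for name in dir(obj) if not name.startswith("_")}


def namespace_diff(reference, candidate):
    """Compare the public names of two namespaces.

    Returns a pair of sorted lists: names only in `reference`,
    and names only in `candidate`.
    """
    ref_names = public_names(reference)
    cand_names = public_names(candidate)
    return sorted(ref_names - cand_names), sorted(cand_names - ref_names)
```

With both libraries installed, `namespace_diff(numpy, cupy)` or `namespace_diff(numpy.linalg, cupy.linalg)` would give the two lists in one call.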

@ev-br
Contributor Author

ev-br commented May 1, 2024

make ndarray methods match

Done modulo

  • ndarray.to_device: what should CuPy ndarrays do here?

In [1]: import cupy

In [2]: import numpy

In [3]: dnp = set(x for x in dir(numpy.ndarray) if not x.startswith('_'))

In [4]: dcp = set(x for x in dir(cupy.ndarray) if not x.startswith('_'))

In [5]: dnp.difference(dcp)
Out[5]: 
{'byteswap',
 'ctypes',
 'getfield',
 'itemset',
 'newbyteorder',
 'resize',
 'setfield',
 'setflags',
 'to_device',
 'tostring'}

In [6]: dcp.difference(dnp)
Out[6]: 
{'cstruct',
 'get',
 'reduced_view',
 'scatter_add',
 'scatter_max',
 'scatter_min',
 'set',
 'toDlpack'}

@ev-br
Contributor Author

ev-br commented May 1, 2024

Paused to run the tests from the tests/cupy_tests folder locally. There are several classes of errors (some working notes are below the fold); the largest source of noise is NEP 50 and the dtype cast changes. I think I'll take a detour to look into NEP 50 related issues before proceeding with the list in the OP.

OK  binary_tests/

    core_tests/
        arrays of different dtypes
        float16 values
        
OK  creation_tests/
OK  cuda_tests/
    fft_tests/
        FIXED DeprecationWarning: `axes` should not be `None` if `s` is not `None` (Deprecated in NumPy 2.0).
        numerical differences for float32

    functional_tests/
        can_cast(scalars)
        arrays of different dtypes
        vectorize_reciprocal: tolerance violation

    indexing_tests/
      FIXED: CuPy only, TypeError: Cannot cast int64 to uint8 in same_kind casting mode

OK  io_tests/

    lib_tests/
          float16
          scalar + numpy scalar  NEP50 in ufuncs

    linalg_tests/
        FIXED: cannot cast int64 -> uint64  (this is from cupy.random)
        float16 values fishy

    logic_tests/
       FIXED in1d is deprecated
       float16 arithmetic is broken (union1d etc.)

    manipulation_tests/
        copyto with scalars
        float16 in unique
        row_stack deprecated
    math_tests/
       out of memory in TestCumprod.test_cumprod_huge_array
       float16 

OK  misc_tests/
OK  padding_tests/
OK  polynomial_tests/
OK  prof_tests/

    random_tests/
       FIXED: NumPy: TypeError: Cannot cast scalar from dtype('int64') to dtype('uint64') according to the rule 'safe'
      illegal pointer access in test_permutations

   sorting_tests/
         float16 : test_search.py::TestCubReduction::test_cub_arg{max, min}, test_sort
         test_argmax_all : fails intermittently
         arg{min, max}_int32_overflow: hangs
   
   statistics_tests/
        AssertionError: ndarrays of different dtypes are returned.
OK  test_cublas.py
OK  test_init.py
OK  test_ndim.py
    test_numpy_interop.py
        TestAsnumpy::test_asnumpy_blocking - cupy.cuda.memory.OutOfMemoryError
    test_type_routines.py
        TypeError: can_cast() does not support Python ints, floats, and complex because the result used to depend on the value.
OK  test_typing.py
    testing_tests/
        TestGenerateMatrix_param_7_{dtype=float16 : float16 values fishy
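
Several of the entries above ("arrays of different dtypes", "scalar + numpy scalar", `can_cast(scalars)`) trace back to NEP 50, under which Python scalars are "weak" and adopt the dtype of the NumPy operand. The snippet below is a NumPy-only sketch of that behavior, not CuPy code; the function names are made up for illustration, and the results depend on the installed NumPy version.

```python
import numpy as np

NUMPY_MAJOR = int(np.__version__.split(".")[0])


def python_scalar_promotion():
    """Return the dtypes produced by mixing NumPy scalars with Python scalars."""
    return {
        "float32 + python float": (np.float32(3.0) + 3.0).dtype,
        "uint8 + python int": (np.uint8(1) + 2).dtype,
    }


def out_of_range_python_int():
    """Describe what happens when a Python int does not fit the NumPy dtype."""
    try:
        result = np.uint8(1) + 300
    except OverflowError as exc:
        return f"OverflowError: {exc}"
    return f"{result!r} with dtype {result.dtype}"


def can_cast_python_int():
    """Describe what np.can_cast does when given a Python int."""
    try:
        return str(np.can_cast(1, np.uint8))
    except TypeError as exc:
        return f"TypeError: {exc}"


print("NumPy", np.__version__)
for label, dtype in python_scalar_promotion().items():
    print(f"{label}: {dtype}")
print("uint8(1) + 300:", out_of_range_python_int())
print("can_cast(1, uint8):", can_cast_python_int())
```

On NumPy 2.x the Python scalars should not upcast the result, the out-of-range addition should raise, and `can_cast` should reject the Python int, which matches the TypeError quoted for test_type_routines.py.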

@ev-br
Contributor Author

ev-br commented May 2, 2024

cross-ref numpy/numpy#26370 : de98f8f works around it, might be able to drop the workaround

@ev-br
Contributor Author

ev-br commented May 7, 2024

Curiously, float16 arithmetic is broken under NumPy 2.0.0rc1. For instance,

(Pdb) p x
array([4., 1., 1., 1., 9., 9., 9.], dtype=float16)
(Pdb) p y
array([4., 0., 5., 2., 0., 0., 5.], dtype=float16)
(Pdb) p xp
<module 'cupy' from '/home/ev-br/repos/cupy_ng/cupy/__init__.py'>
(Pdb) p xp.union1d(x, y)
array([4., 1., 9., 4., 0., 5., 2., 0., 5.], dtype=float16)

A significant fraction of remaining test failures are due to float16.
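
For reference, `union1d` is documented to return the sorted, unique values found in either input, whereas the output above is unsorted and still contains duplicates. A NumPy-only sketch with the same inputs (no GPU required) shows the result that CuPy is expected to match:

```python
import numpy as np

# Same inputs as in the pdb session above.
x = np.array([4, 1, 1, 1, 9, 9, 9], dtype=np.float16)
y = np.array([4, 0, 5, 2, 0, 0, 5], dtype=np.float16)

# Sorted union of the unique values of x and y, still float16.
expected = np.union1d(x, y)
print(expected, expected.dtype)
```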

@ev-br
Contributor Author

ev-br commented May 7, 2024

Decision needed: numpy.in1d emits DeprecationWarnings. 39cdc53 does this:

  • stop testing cupy.in1d, only test cupy.isin
  • keep cupy.in1d in the public API.
  • do we want cupy.in1d to also emit DeprecationWarnings? As is, it does not.

More generally, what is the CuPy action when NumPy deprecates something? (match the deprecation, just remove an API when numpy removes it, something else?)
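
If the answer turns out to be "match the deprecation", one possible shape is a thin wrapper that warns and then forwards to the replacement. The sketch below is pure Python and only illustrates that option: `deprecated_alias` is a hypothetical helper, and the list-based `isin` is a stand-in rather than the CuPy implementation.

```python
import functools
import warnings


def deprecated_alias(new_func, old_name):
    """Return a wrapper that warns about `old_name` and then calls `new_func`."""

    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"`{old_name}` is deprecated; use `{new_func.__name__}` instead.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return new_func(*args, **kwargs)

    wrapper.__name__ = old_name
    return wrapper


def isin(element, test_elements):
    """Stand-in for an `isin`-like function operating on plain lists."""
    pool = set(test_elements)
    return [item in pool for item in element]


# The deprecated name keeps working but now emits a DeprecationWarning.
in1d = deprecated_alias(isin, "in1d")
```

This keeps the old name in the public API while making the deprecation visible to users, which sits between the "keep silently" and "remove" options.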

Contributor

mergify bot commented May 13, 2024

This pull request is now in conflicts. Could you fix it @ev-br? 🙏

ev-br added 13 commits May 13, 2024 07:12
These tests fail with "arrays of different dtypes are returned" on
NumPy 1.26.4 with weak promotion (export NPY_PROMOTION_STATE=weak),
but pass under NumPy '2.1.0.dev0+git20240419.da95f8e'.

Therefore, this commit can be reverted when NumPy 2.0 becomes the
minimum supported version.
The NumPy < 2 idiom,
int( np.array(seed).astype(np.uint64, casting='safe') )
does not work in NumPy 2.0
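
One version-independent way to validate a seed is to check the range explicitly instead of relying on `casting='safe'`. The helper below is a hypothetical sketch under that assumption and is not necessarily what this commit does:

```python
import operator

UINT64_MAX = 2**64 - 1


def seed_to_uint64(seed):
    """Return `seed` as a Python int after checking that it fits in uint64.

    operator.index accepts Python ints and integer-like objects
    (anything defining __index__) and raises TypeError for floats.
    """
    value = operator.index(seed)
    if not 0 <= value <= UINT64_MAX:
        raise ValueError(f"seed must be in [0, 2**64 - 1], got {value}")
    return value
```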
Test that either both NumPy and CuPy raise OverflowErrors,
or both do not.
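
A "both raise or neither raises" check can be sketched as follows. The helper names are hypothetical and the two `exp` functions are pure-Python stand-ins for the NumPy and CuPy calls under test:

```python
import math


def raises_overflow(func, *args, **kwargs):
    """Return True if calling `func` raises OverflowError, else False."""
    try:
        func(*args, **kwargs)
    except OverflowError:
        return True
    return False


def assert_same_overflow_behavior(reference_func, candidate_func, *args, **kwargs):
    """Fail unless both callables agree on whether OverflowError is raised."""
    reference_raised = raises_overflow(reference_func, *args, **kwargs)
    candidate_raised = raises_overflow(candidate_func, *args, **kwargs)
    if reference_raised != candidate_raised:
        raise AssertionError(
            f"reference raised OverflowError: {reference_raised}, "
            f"candidate raised OverflowError: {candidate_raised}"
        )


def reference_exp(x):
    """Stand-in reference: math.exp raises OverflowError for large inputs."""
    return math.exp(x)


def candidate_exp(x):
    """Stand-in candidate with a simplified overflow threshold."""
    if x > 709.0:
        raise OverflowError("result too large")
    return math.exp(x)


assert_same_overflow_behavior(reference_exp, candidate_exp, 1.0)     # neither raises
assert_same_overflow_behavior(reference_exp, candidate_exp, 1000.0)  # both raise
```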
Otherwise, `arange(3, dtype=cupy.uint8).round(-2)` raises: `-2` cannot be cast
to the array unsigned dtype.
The issue is this: consider `cupy.uint8(2) / (-2)`. Since -2 is a
Python scalar, the ufunc selects a 'BB->d' loop based on the type of
the other argument. The ufunc kernel then declares
`in1_type in1`, which is initialized as `uint8_t in1(-2)`.

With only 'int64, int64 -> float64' and 'uint64, uint64 -> float64'
loops (this PR), the problem appears only for uint64(...) / (-2),
and that's probably rare enough.
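
The wrap-around described above can be reproduced without a GPU. This is only an illustration in pure Python, using `ctypes.c_uint8` to mimic the C++ initialization `uint8_t in1(-2)`; it is not the CuPy kernel code, and `as_uint8` is a hypothetical helper.

```python
import ctypes


def as_uint8(value):
    """Mimic `uint8_t in1(value)`: the value wraps modulo 256."""
    return ctypes.c_uint8(value).value


wrapped = as_uint8(-2)        # -2 reinterpreted as an unsigned 8-bit value
quotient_wrapped = 2 / wrapped  # what a uint8 loop would effectively compute
quotient_expected = 2 / -2      # what the user meant

print(wrapped, quotient_wrapped, quotient_expected)
```

The negative operand silently becomes a large positive number, so the quotient has the wrong sign and magnitude, which is why restricting the loops to 64-bit integer inputs confines the problem to `uint64(...) / (-2)`.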
@ev-br ev-br force-pushed the np2.0 branch 2 times, most recently from c01ad40 to dc148b1 Compare May 14, 2024 10:49
seberg added a commit to seberg/cupy-feedstock that referenced this pull request May 23, 2024
This (tries to) build artifacts based on Evgeni's PR to CuPy which should
have most fixes needed to run with NumPy 2 and thus allow testing downstream
projects with NumPy 2 if they also need CuPy.

This includes most/all necessary Python fixes as well as fixes to make
promotion align with NumPy (NEP 50).

See also:
* cupy/cupy#8323
* cupy/cupy#8314
* cupy/cupy#8306
Labels
cat:numpy-compat Follow the NumPy/SciPy spec changes prio:medium

2 participants