Minutes_2022_03_29

Numba Meeting: 2022-03-29

Attendees: Siu Kwan Lam, Guilherme Leobas, Nick Riasanovsky, Andre Masella, brandon willard, stuart, Travis Oliphant, Val, Benjamin Graham, Jim Pivarski, Ehsan Totoni, Todd A. Anderson, Graham Markall

NOTE: All communication is subject to the Numba Code of Conduct.

Please refer to this calendar for the next meeting date.

0. Feature Discussion

Vision document update
- Will be opening a PR for comments and suggestions
- Follow up discussion next week
PTX Caching:
- PoC Gist
- PR: #7944 - [WIP] Pathfinding for CUDA kernel caching
- Discussion:
  - Concern: a new CUDA driver could leave the cache holding an unsupported version of PTX/cubin.
  - Likely fixed by adding the CUDA driver version to the cache key.
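
A minimal sketch of the cache-key idea discussed above, assuming a key built from a hash of the kernel source, the compute capability, and the driver version. The function `make_cache_key` and its parameters are hypothetical and are not taken from PR #7944; the point is only that a driver change yields a different key, so stale entries are never looked up.

```python
import hashlib


def make_cache_key(source_hash, compute_capability, driver_version):
    """Build a cache key string (hypothetical helper, not Numba's API).

    compute_capability and driver_version are (major, minor) tuples.
    """
    parts = (
        source_hash,
        "cc=%d.%d" % compute_capability,
        "driver=%d.%d" % driver_version,
    )
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


# Same kernel and device, two different (made-up) driver versions.
key_old = make_cache_key("abc123", (7, 5), (11, 4))
key_new = make_cache_key("abc123", (7, 5), (11, 6))
```

Because the driver version is part of the hashed input, `key_old` and `key_new` differ, and a cache populated under one driver is simply missed under the other.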
CUDA discussion:
- ~~#7926 - CUDA: remove context query in launch configuration~~
- #7927 - CUDA: discussion: retire the use of a separate launch configuration object, launch kernels directly!
  - What does the API for other libraries (CuPy, PyCUDA, etc.) look like?
    - CuPy: add_kernel((5,), (5,), (x1, x2, y)) # grid, block and arguments
    - PyCUDA: multiply_them(drv.Out(dest), drv.In(a), drv.In(b), block=(400,1,1), grid=(1,1))
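
For comparison, the two call shapes under discussion can be sketched in plain Python without a GPU. `MockKernel`, `LaunchConfig`, and `launch` are hypothetical stand-ins, not Numba's implementation and not the API proposed in #7927; the subscript form mirrors Numba's existing `kernel[grid, block](*args)` syntax, while the direct form resembles the CuPy and PyCUDA calls listed above.

```python
class MockKernel:
    """Hypothetical stand-in for a compiled kernel; runs no GPU code."""

    def __init__(self, name):
        self.name = name

    def __getitem__(self, config):
        # Subscript style: returns a separate launch configuration object.
        grid, block = config
        return LaunchConfig(self, grid, block)

    def launch(self, grid, block, *args):
        # Direct style: grid and block are passed alongside the arguments.
        # A real implementation would launch the kernel here.
        return (self.name, grid, block, args)


class LaunchConfig:
    """Hypothetical object holding a kernel plus its grid and block sizes."""

    def __init__(self, kernel, grid, block):
        self.kernel = kernel
        self.grid = grid
        self.block = block

    def __call__(self, *args):
        return self.kernel.launch(self.grid, self.block, *args)


k = MockKernel("add")
via_config = k[(5,), (5,)](1, 2, 3)     # two steps: configure, then call
direct = k.launch((5,), (5,), 1, 2, 3)  # one step: launch directly
```

Both paths end in the same launch; the difference is whether an intermediate configuration object is created for each call.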
Moving NRT to LLVM IR so it is JIT-able
- For use cases that require allocating small objects frequently (e.g. in a loop).
- Exposes the details of allocation to LLVM, allowing more aggressive optimization of the memory allocation routines.
  - Potentially turns heap allocations into stack allocations.
- Nick: how often do we see this?
  - Stuart: quite frequently asked
- Stuart: NRT calls are opaque to LLVM, which can prevent code motion. This is often a problem in container types.
- Ben: Will this apply to the Python C-API?
  - Stuart: Don't think it will apply to that code path.
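
The kind of code this targets can be illustrated with a small sketch: a loop that allocates a short-lived array on every iteration. The function name is made up for illustration, and the fallback decorator is only there so the example also runs where Numba is not installed. Whether such allocations would actually be moved to the stack depends on the proposed change and on LLVM, which is not shown here.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fall back to plain Python so the sketch still runs without Numba.
    def njit(func):
        return func


@njit
def sum_of_small_temporaries(n):
    total = 0.0
    for i in range(n):
        # A small array is allocated on every iteration; in compiled code
        # each allocation goes through the NRT allocator.
        tmp = np.empty(3)
        tmp[0] = i
        tmp[1] = 2.0 * i
        tmp[2] = 3.0 * i
        total += tmp.sum()
    return total


result = sum_of_small_temporaries(4)
```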

1. New Issues

#7928 - CUDA: functions can only be used as device functions or as kernels, not both in the same program.
#7929 - CUDA: Using NUMBA_CUDA_USE_NVIDIA_BINDING with other GPU Libraries
#7932 - llvmlite version issue while running tests
#7933 - Checking *args is empty inside overloaded method
#7935 - Experiment with adding a domain field to integer types
#7936 - Add official support for thread IDs
#7937 - enhancement for array_equal

Closed Issues

#7930 - Numba usage and documentation
#7931 - llvmlite too low version
#7934 - Incorrect prange loop index typing, but only in parallel mode

2. New PRs

#7938 - array_equal new impl
#7940 - Implemented np.allclose in numba/np/arraymath.py
#7941 - Remove debug dump output from closure inlining pass.

Closed PRs

#7939 - Fix rendering in feature request template.
