esc edited this page Mar 30, 2022 · 1 revision

Numba Meeting: 2022-03-29

Attendees: Siu Kwan Lam, Guilherme Leobas, Nick Riasanovsky, Andre Masella, brandon willard, stuart, Travis Oliphant, Val, Benjamin Graham, Jim Pivarski, Ehsan Totoni, Todd A. Anderson, Graham Markall

NOTE: All communication is subject to the Numba Code of Conduct.

Please refer to this calendar for the next meeting date.

0. Feature Discussion

  • Vision document update
    • Will be opening a PR for comments and suggestions
    • Follow up discussion next week
  • PTX Caching:
    • PoC Gist
    • PR: #7944 - [WIP] Pathfinding for CUDA kernel caching
    • Discussion:
      • Concern: after a CUDA driver update, the cached PTX/cubin may be of an unsupported version
      • Likely fixed by adding the CUDA driver version to the cache key
  • CUDA discussion:
    • #7926 - CUDA: remove context query in launch configuration
    • #7927 - CUDA: discussion: retire the use of a separate launch configuration object, launch kernels directly!
      • What does the API for other libraries (CuPy, PyCUDA, etc.) look like?
        • CuPy: add_kernel((5,), (5,), (x1, x2, y)) # grid, block and arguments
        • PyCUDA: multiply_them(drv.Out(dest), drv.In(a), drv.In(b), block=(400,1,1), grid=(1,1))
  • Moving NRT to LLVM IR so it is JIT-able
    • For use cases that require allocating small objects frequently (e.g. in a loop)
    • Exposes the details of allocation to LLVM, allowing more aggressive optimization of the memory allocation routines
      • Potentially turning heap allocations into stack allocations
    • Nick: how often do we see this?
      • Stuart: quite frequently asked
    • Stu: NRT can prevent LLVM code motion because the NRT calls are opaque to LLVM. This is often a problem with container types.
    • Ben: Will this apply to the Python C-API?
    • Stu: Don't think it will apply to that code path.
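
To illustrate the PTX caching point above (adding the CUDA driver version to the cache key), here is a minimal sketch in plain Python. It is not the approach taken in #7944. The helper `make_cache_key`, its parameters, and the sample values are all hypothetical and only show the idea that a change in driver version should yield a different key, so stale PTX/cubin entries are not reused.

```python
import hashlib


def make_cache_key(kernel_source, signature, compute_capability, driver_version):
    """Hypothetical helper: derive a cache key for a compiled kernel.

    All four inputs are assumptions made for illustration; a real cache
    would likely include more (e.g. compiler flags, library versions).
    """
    h = hashlib.sha256()
    parts = (
        kernel_source,
        signature,
        repr(compute_capability),
        repr(driver_version),
    )
    for part in parts:
        data = part.encode("utf-8")
        # Length-prefix each field so adjacent fields cannot be confused.
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)
    return h.hexdigest()


# Made-up sample values: same kernel and device, two driver versions.
src = "def add(x, y, out): ..."
sig = "(float32[:], float32[:], float32[:])"
key_old = make_cache_key(src, sig, (7, 5), (11, 4))
key_new = make_cache_key(src, sig, (7, 5), (11, 6))
key_old_again = make_cache_key(src, sig, (7, 5), (11, 4))
```

With this scheme, `key_old` and `key_new` differ, so an entry cached under the older driver is simply never looked up after an upgrade, while `key_old_again` matches `key_old` and would hit the cache.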

1. New Issues

  • #7928 - CUDA: functions can only be used as device functions or as kernels, not both in the same program.
  • #7929 - CUDA: Using NUMBA_CUDA_USE_NVIDIA_BINDING with other GPU Libraries
  • #7932 - llvmlite version issue while running tests
  • #7933 - Checking *args is empty inside overloaded method
  • #7935 - Experiment with adding a domain field to integer types
  • #7936 - Add official support for thread IDs
  • #7937 - enhancement for array_equal

Closed Issues

  • #7930 - Numba usage and documentation
  • #7931 - llvmlite too low version
  • #7934 - Incorrect prange loop index typing, but only in parallel mode

2. New PRs

  • #7938 - array_equal new impl
  • #7940 - Implemented np.allclose in numba/np/arraymath.py
  • #7941 - Remove debug dump output from closure inlining pass.

Closed PRs

  • #7939 - Fix rendering in feature request template.

3. Next Release: Version 0.56.0/0.39.0, RC,
