Numba Meeting: 2022-11-22

Attendees: Siu Kwan Lam, Todd A. Anderson, Andre Masella, Graham Markall, Guilherme, Ianna Osborne, Mathew Murray, Stuart, Kaustubh Chaudhari, Brandon Willard

FPOC (last week): Siu
FPOC (incoming): Stuart

NOTE: All communication is subject to the Numba Code of Conduct.

Please refer to this calendar for the next meeting date.

0. Discussion

  • math functions redo (Siu/Stu)

    • problems:
      • untangle the incorrect interdependency between math.* and np.*
        • e.g. lack of error checking
          • different behavior at boundary values between math and NumPy math functions
      • inconsistent performance, e.g. whether SIMD vectorization applies
      • CUDA math functions are implemented directly on top of CUDA intrinsics
    • proposed solution:
      • implement correctly, i.e. follow CPython or NumPy behavior
      • emit a performance warning for math.* functions due to their inability to SIMD-vectorize
      • advise users to use NumPy for SIMD-vectorizable code
    • TODO: make a Discourse topic for broader discussion
    • For 0.57
      • Ufunc equivalents for math functions
      • Add warnings suggesting that people use the ufunc equivalents (see the sketch after this item)
    • No breaking or performance changes until 0.58+
    • Stuart doesn't want to do it in pieces, so it will be all or nothing.
    • Need to add an RFC category on Discourse.
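
    To make the distinction concrete, here is a minimal sketch using only existing Numba/NumPy features (not the proposed 0.57 changes); the function names scalar_loop and ufunc_form are illustrative only.

    ```python
    import math
    import numpy as np
    from numba import njit

    @njit
    def scalar_loop(x):
        # math.* call per element: scalar semantics, but (per the notes above)
        # not SIMD-vectorizable, hence the proposed performance warning.
        out = np.empty_like(x)
        for i in range(x.shape[0]):
            out[i] = math.sin(x[i])
        return out

    @njit
    def ufunc_form(x):
        # np.* ufunc equivalent: the form users would be advised to prefer
        # for SIMD-vectorizable code.
        return np.sin(x)

    x = np.linspace(0.0, 1.0, 1000)
    assert np.allclose(scalar_loop(x), ufunc_form(x))

    # The boundary-value difference mentioned above, in plain (un-jitted) Python:
    # math.sqrt(-1.0) raises ValueError, while np.sqrt(-1.0) returns nan and
    # emits a RuntimeWarning.
    ```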
  • Extension hooks (analysis and lowering).

    • Numba has a few extension hooks. These are global dictionaries that map IR nodes to functions (see: numba.core.analysis.*_extension_* at global scope). For instance, there was a lowering hook to add custom lowering for custom IR nodes.
    • The question: Should this be the preferred way of doing this?
    • Todd: There are three possibilities:
      1. what we currently have
      2. do what we just did for lowering_extensions, i.e. convert them into passes
        • Stuart: this is possible for all cases. (1) exists because of the historical lack of pass management. This is better than injecting into global state.
      3. Could move the analysis (etc.) calls into the IR node types themselves. The disadvantage is spreading all these calls about.
    • Examples of these extension mappings, for reference (a registration sketch follows this item):
      • numba.core.ir_utils.alias_analysis_extensions
      • numba.core.ir_utils.alias_func_extensions
      • numba.core.ir_utils.is_pure_extensions
    • Stuart Q: Is it acceptable to globally modify the compiler's behavior?
    • Stuart Q: If it is possible to modify extensions, how do we guarantee that the extensions can work together?
    • Is there a compiler that has a plugin architecture?
      • GCC has a plugin API, there's a security project that builds plugins for this.
      • Doing things like adding new IR nodes impacts core infrastructure.
      • E.g. for OpenMP, the various offload devices have plugin infrastructure, a standard API, etc.
      • Are there invariants where all the things can agree to "play nicely" and not interfere with each other?
    • If Numba IR were closer to MLIR it might be easier to allow extensions, etc. But this sort of design is completely different from what exists at present.
    • Is a combination of 2 and 3 needed? E.g. use-def analysis is fundamental to the node type.
    • Fundamentally, this might not be that pressing an issue; nothing is broken right now, but the API is now definitely asymmetric. Use this discussion as informative input towards making extension points for a compiler toolkit.
    • Create an issue or RFC to keep track of this information.
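
    For context, here is a conceptual sketch of the registration pattern described above, assuming only that the listed names are plain module-level dicts as stated in the notes; MyCustomNode, my_alias_handler, and the handler signature are hypothetical placeholders, so check the Numba source for the exact convention each dict expects.

    ```python
    # Conceptual sketch only: the module path comes from the notes above, but
    # the node class, handler name, and handler signature are hypothetical
    # placeholders -- the real convention differs per extension dict.
    from numba.core import ir_utils

    class MyCustomNode:
        """Hypothetical custom IR node, for illustration only."""

    def my_alias_handler(node, *analysis_state):  # hypothetical signature
        """Teach alias analysis how a MyCustomNode behaves."""

    # Registration is a mutation of module-level global state, which is exactly
    # the design property being questioned in the discussion above.
    ir_utils.alias_analysis_extensions[MyCustomNode] = my_alias_handler
    ```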
  • RFC Mission statement review time.

    • Please suggest updates/changes by the end of November 2022.
    • Keep the review cadence at 6 months.
  • When is the "jit" fallback to object mode being removed?

    • 2 releases after full deprecation.
    • Full deprecation is scheduled for 0.57.
    • It should be possible to create a custom target to do this if you still really want it.
    • forceobj and looplift will still work if specified explicitly (see the sketch after this item).
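
    A minimal sketch of the distinction discussed above, using the existing numba.jit keywords forceobj and looplift; the functions themselves are trivial placeholders.

    ```python
    from numba import jit

    # Plain @jit: its *implicit* fallback to object mode is what is being
    # deprecated and later removed.
    @jit
    def plain(x):
        return x + 1

    # Explicitly requested object mode keeps working after the removal.
    @jit(forceobj=True)
    def explicit_objmode(x):
        return x + 1

    # Loop lifting can also still be requested explicitly alongside object mode.
    @jit(forceobj=True, looplift=True)
    def explicit_lifted(n):
        acc = 0
        for i in range(n):
            acc += i
        return acc
    ```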

New "Ready for Review" PRs

1. New Issues

  • #8603 - Input type None is not recognized
  • #8607 - getitem(array(float64, 2d, C), UniTuple(array(int64, 1d, C) x 2))
  • #8609 - CUDA: register an extension globally
  • #8610 - Optional UniTuple gives error depending on the signature
  • #8612 - numba for pytorch Tensor for multiple gpus
  • #8613 - Raise exception fails with whitespace after function def
  • #8615 - "SystemError: initialization of _internal failed without raising an exception"

Closed Issues

  • #8608 - Dynamic jitclass fails

2. New PRs

  • #8601 - Check for type mismatch in np.kron
  • #8604 - Allow empty tuple as size for numpy rng
  • #8605 - Support for CUDA fp16 math functions (part 1)
  • #8611 - [structref] Add a method for structref _Utils
  • #8614 - py3.11 hashing updates

Closed PRs

  • merged - #8600 - Chrome trace timestamp should be in microseconds not seconds.
  • merged - #8602 - Throw error for unsupported dunder methods
  • merged - #8606 - [Doc] Make the RewriteArrayExprs doc more precise

3. Next Release: Version 0.57.0/0.40.0, RC Jan 2023
