Minutes_2022_11_01

Numba Meeting: 2022-11-01

Attendees: Val, Andre Masella, brandon willard, Guilherme, Jim Pivarski, Kaustubh Chaudhari, LI Da, Shannon Quinn, stuart, Todd A. Anderson, Siu Kwan Lam
FPOC (last week): Val
FPOC (Incoming): Graham

NOTE: All communication is subject to the Numba Code of Conduct.

Please refer to this calendar for the next meeting date.

0. Discussion

0.56.4 release (Val)
- Resolves issue with CUDA device array view on NumPy 1.23 (#8537)
- Aiming to tag by Thursday 3rd November
  - Artifact upload by Friday 4th November
py3.11 status (siu)
- https://github.com/numba/numba/pull/8545
- current task: stabilization for py<3.11
- testsuite 90% passing on py3.11
- Stuart: requests that all the failing items be put on a list somewhere so they can be tracked and sorted
Default @overload target as 'generic' (#8554) (stu).
- Finale of a two-year effort to build a framework for extending Numba with new targets
- Means that overloads can be called from CPU, CUDA, or any target by default and it should "just work" (modulo features that actually work on the target)
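
As a rough mental model of what a 'generic' default means, the toy sketch below resolves an overload by walking up a target hierarchy. Everything in it (the `PARENT` table, `overload`, `resolve`, and the example function names) is hypothetical pure Python. It is not Numba's implementation, only an illustration of the fallback idea.

```python
# Toy model only: hypothetical names, not Numba's real machinery.
# Each target names its parent; "generic" is the root of the hierarchy.
PARENT = {"generic": None, "cpu": "generic", "cuda": "generic"}

_REGISTRY = {}  # (function name, target) -> implementation


def overload(name, target="generic"):
    """Register an implementation; the target defaults to 'generic'."""
    def decorator(impl):
        _REGISTRY[(name, target)] = impl
        return impl
    return decorator


def resolve(name, target):
    """Find an implementation for `target`, walking up towards 'generic'."""
    current = target
    while current is not None:
        impl = _REGISTRY.get((name, current))
        if impl is not None:
            return impl
        current = PARENT[current]
    raise LookupError(f"no overload of {name!r} usable from target {target!r}")


@overload("clamp")  # no target given, so it is registered as 'generic'
def clamp_impl(x, lo, hi):
    return min(max(x, lo), hi)


@overload("warp_size", target="cuda")  # only meaningful on one target
def warp_size_impl():
    return 32
```

In this sketch `resolve("clamp", "cpu")` and `resolve("clamp", "cuda")` both find `clamp_impl`, while `resolve("warp_size", "cpu")` raises `LookupError`, which loosely mirrors the "modulo features that actually work on the target" caveat.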
M1 LLVM assertion error at compile time update (Stu/Siu)
- Full writeup coming soon! Stay tuned!
- Just minimal audio commentary for now.
- This is an LLVM issue for sure, probably not fixed with LLVM 14 either.
- Numba is just unlucky to have hit this, and probably it can't be fixed by changing Numba.
- Reproduction relies on JIT compiling a large program, then allocating something large on the heap (100MB), then JIT compiling something again, then....
- Current hypothesis: the runtime linker is unable to generate jumps that cross a span of virtual memory, which triggers the assertion error.
- There is currently no fix for this on M1 and the failures are quite random. Bad news for M1.
- Platform conditional skip around the test that creates the giant memory allocation?
- The change to leverage stencil tests to all Python versions is likely what started to trigger this.
- Maybe connect to LLVM devs like Lang or David Spiket (spelling?) or someone who knows about linker.
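
The platform conditional skip floated above could look roughly like the following sketch. The test case name and the 100 MB stand-in body are hypothetical, and the detection assumes Apple Silicon reports `sys.platform == "darwin"` with machine `"arm64"`. It uses only the standard library's `unittest`, not Numba's own test helpers.

```python
import platform
import sys
import unittest

# Assumption: Apple Silicon reports sys.platform == "darwin" and machine "arm64".
IS_APPLE_SILICON = sys.platform == "darwin" and platform.machine() == "arm64"


class TestLargeAllocation(unittest.TestCase):  # hypothetical test case name

    @unittest.skipIf(IS_APPLE_SILICON, "may trigger an LLVM assertion error on M1")
    def test_large_heap_allocation(self):
        # Stand-in for the real test: allocate roughly 100 MB on the heap.
        size = 100 * 1024 * 1024
        buf = bytearray(size)
        self.assertEqual(len(buf), size)


def run_suite():
    """Run the test case and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(TestLargeAllocation)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

On other platforms the test runs as normal; on an M1 machine it is reported as skipped rather than failing at random.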
Adding support for new NumPy dtypes out-of-tree (Graham)
- https://github.com/gmarkall/numba-bfloat16/blob/main/prototype.py#L31
- https://github.com/GreenWaves-Technologies/bfloat16
- Adding support for a new DType that isn't in Numba, e.g. bfloat16
- What are thoughts on providing a public way to register new dtypes, rather than using numpy_support.FROM_DTYPE hacks?
- For now, we stay with the hacky monkey patch and wait for more concrete support in Numba.
Switch error mode to new_style (Guilherme)
- New style errors: if a standard exception (non-Numba exception) is raised anywhere in compilation, compilation stops immediately.
  - If you want to cause an error e.g. for a non-acceptable type, you need to raise a subclass of NumbaError instead.
- It's more useful when developing Numba compared to the old style
- Suggestion is to make it the default now
- Stuart: anything that motivates this? Answer: no
- Need to deprecate the old style of errors
  - Anyone using it might find their code doesn't compile when they upgrade
- Expected to work well given it's the default on CI, but need to warn users first in case they have overloads they're using.
- Conclusion: it is a good idea, but we need to deprecate the old style first before making new the default.
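
To make the difference between the two error modes concrete, here is a simplified toy model. The classes `NumbaError` and `TypingError` below are local stand-ins, and `resolve`, `rejects_floats` and `buggy_template` are hypothetical names. Real Numba compilation is considerably more involved, so treat this as a sketch of the capture rule only.

```python
class NumbaError(Exception):
    """Local stand-in for Numba's base error class (toy model only)."""


class TypingError(NumbaError):
    """Local stand-in for an intentional 'type not acceptable' error."""


def resolve(templates, arg_type, error_mode):
    """Return the result of the first template that accepts `arg_type`.

    old_style: every exception raised by a template is captured and the
               template is treated as not matching.
    new_style: only NumbaError subclasses are captured; any other exception
               propagates, so "compilation" stops immediately.
    """
    captured = NumbaError if error_mode == "new_style" else Exception
    failures = []
    for template in templates:
        try:
            return template(arg_type)
        except captured as exc:
            failures.append(exc)
    raise TypingError(f"no match for {arg_type!r}; captured: {failures!r}")


def rejects_floats(arg_type):
    # Intentional rejection: raise a NumbaError subclass.
    if arg_type == "float64":
        raise TypingError("float64 is not supported")
    return "int_impl"


def buggy_template(arg_type):
    # A genuine bug: a str has no attribute 'ndim'.
    return arg_type.ndim
```

In this model, `resolve([buggy_template], "int64", "old_style")` ends in a `TypingError` that buries the real bug in its message, whereas the same call with `"new_style"` surfaces the `AttributeError` directly. This is also why existing overloads that raise standard exceptions to reject a type could stop compiling once the default changes.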
Writing/updating code to use libm as the base (stu)
- Various power functions (math.pow, np.power, etc.) all do similar things with slightly different implementations, possibly referencing each other, which makes it difficult to debug and to implement the expected behaviour in all cases.
- Question: "What would be a better abstraction for these implementations?"
- Differentiating the implementations at the point of libm function calls would be a better abstraction
  - E.g. libm function for power for CPU might call C libm
  - The power function for CUDA in libm might instead call libdevice
  - The power function overloads then all use overloaded libm instead of calling into other overloads like math.pow etc.
  - In general implementations should then be written in terms of libm to make behavior consistent.
- Questions:
  - Val: Would this involve vendoring a libm? Answer: no, we'd just need to wrap platform libm functions with a call to them - no need to jit compile low-level libm functions themselves. Will add a numba.libm module that stubs the functions, then we @overload the stubs for CUDA, CPU, etc.
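
The layering discussed above can be sketched in plain Python. This is not the proposed numba.libm module: the `LIBM_POW` table, `libm_pow`, `math_pow` and `np_power` are hypothetical stand-ins, the "cuda" entry is a placeholder that never touches a GPU or libdevice, and the CPU entry wraps the platform libm through `ctypes` (falling back to `math.pow` if no libm is found).

```python
import ctypes
import ctypes.util
import math


def _load_cpu_pow():
    """Wrap the platform C libm `pow`; fall back to math.pow if unavailable."""
    path = ctypes.util.find_library("m")
    if path is None:
        return math.pow  # assumption: acceptable fallback for this sketch
    try:
        libm = ctypes.CDLL(path)
    except OSError:
        return math.pow
    libm.pow.argtypes = [ctypes.c_double, ctypes.c_double]
    libm.pow.restype = ctypes.c_double
    return libm.pow


def _cuda_pow_placeholder(x, y):
    # Placeholder standing in for a libdevice call; nothing here uses a GPU.
    return x ** y


# Hypothetical per-target table playing the role of an overloaded libm stub.
LIBM_POW = {"cpu": _load_cpu_pow(), "cuda": _cuda_pow_placeholder}


def libm_pow(x, y, target="cpu"):
    """The single point where implementations differ per target."""
    return LIBM_POW[target](float(x), float(y))


def math_pow(x, y, target="cpu"):
    """math.pow-like function written only in terms of libm_pow."""
    return libm_pow(x, y, target)


def np_power(xs, ys, target="cpu"):
    """np.power-like elementwise function, also written in terms of libm_pow."""
    return [libm_pow(x, y, target) for x, y in zip(xs, ys)]
```

Neither `math_pow` nor `np_power` calls the other; both go through `libm_pow`, so a behaviour fix at the libm layer applies to every power function at once.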

1. New Issues

#8540 - Consider adding URL to Issue 6482 in _guard_py_ver message
#8543 - Rearanging the order of axes in cuda jitted code
- These can return new arrays, so it would have to be a view.
#8546 - Return the type(obj) in numba.typeof when DISABLE_JIT is enabled
- @sklam does not believe this is feasible and will comment on this issue.
#8551 - Support for np.linalg.svd(X, compute_uv=False)
- @stuartarchibald will have to sit down and do the work.
#8559 - Add file and line number information in NRT Debug statements.
- It may also be worth adding more docs.
#8561 - Numba 0.56.4 Checklist
#8563 - workqueue threading layer "safety" mechanism aliases nesting and concurrent access
- Parallel workqueue issue mentioned on Discourse (Graham):
- https://numba.discourse.group/t/terminating-nested-parallel-kernel-launch-detected-the-workqueue-threading-layer-does-not-supported-nested-parallelism/1621
llvmlite#885 - Python 3.11
- Related PR #869 needs CI changes to test on 3.11 as well
- Blocked on the Anaconda distribution of Python 3.11

Closed Issues

#8548 - Azure CI builds of main failing on rstcheck

2. New PRs

#8544 - Remove reliance on npy_<impl> ufunc loops.
#8545 - Py3.11wip
#8547 - [Unicode] Add more string view usages for unicode operations
#8550 - Changes how tests are split between test instances
#8554 - Make target for @overload have 'generic' as default.
#8555 - Try xml test reporting in Azure
#8556 - Fix np.fmod to match numpy behavior
#8557 - [Unicode] support startswith with args, start and end.
#8558 - [Unicode] support int(str, base=10)
#8560 - CUDA: Get rid of _nvvm_options_type
#8562 - Cherry picks 0.56.4
llvmlite#886 - Simplify setup.py Python version guard
llvmlite#887 - Speeds up module.add_debug_info and add_metadata by 2x

Closed PRs

merged - #8541 - Remove restoration of "free" channel in Azure CI windows builds.
merged - #8542 - CUDA: Make arg optional for Stream.add_callback()
merged - #8549 - Fix rstcheck in Azure CI builds, update sphinx dep and docs to match
merged - #8552 - Update version support table for 0.56.4.
merged - #8553 - Update CHANGE_LOG for 0.56.4
