Minutes_2018_09_25

Numba Meeting: 2018-09-25

Attendees: Stan, Ehsan, Stuart, Siu, Todd

0. Feature Discussion

String discussion
- Things we've learned about Python unicode so far
  - Get parts of the unicode struct using the PyUnicode_GET_LENGTH, PyUnicode_KIND, and PyUnicode_DATA C macros. These need to be wrapped in a helperlib.c function.
  - Unicode strings built with old C API are not automatically in "canonical" form. Need to call PyUnicode_READY first to prepare them, which may do some memory allocation.
  - Create new unicode object with PyUnicode_FromKindAndData.
  - Once ready, data pointer can be used as a uint8, uint16 or uint32 array, depending on the value of kind.
- Right Numba extension API to use?
  - Built in types use registries
  - Documented extension API does not
  - Can cause strange behavior when using @jit vs. compile_isolated in unit tests?
- How to upgrade strings without breaking the existing partial support that treats them as a special constant type?
- How to handle memory ownership?
  - Similar to array: Numba can have a view on Python-managed string storage. If Numba creates a string, we have to manage it with NRT in case it is put into a jitclass.
  - Different: As soon as we box the Numba string into a Python unicode object, Python copies the string into its own memory, and the refcount on the NRT version should be decremented. (No Python unicode views on NRT-managed memory.)
  - Better: Strings being immutable does make reasoning about all of this easier.
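As a rough illustration of the kind/data layout described in the unicode notes above, the following pure-Python sketch mimics what PyUnicode_KIND and PyUnicode_DATA expose for a ready string. The helper names `unicode_kind` and `code_units` are hypothetical and are not part of Numba or CPython; a real implementation would read the struct through C helpers rather than re-encoding the string.

```python
import sys


def unicode_kind(s):
    """Bytes per code unit that CPython's compact (PEP 393) layout uses for s."""
    max_cp = max(map(ord, s), default=0)
    if max_cp < 0x100:
        return 1
    if max_cp < 0x10000:
        return 2
    return 4


def code_units(s):
    """Return (kind, code units), viewing the data as uint8/uint16/uint32.

    Illustration only: this re-encodes the string instead of reading
    CPython's internal buffer, and it does not handle lone surrogates.
    """
    kind = unicode_kind(s)
    little = sys.byteorder == "little"
    codec = {
        1: "latin-1",
        2: "utf-16-le" if little else "utf-16-be",
        4: "utf-32-le" if little else "utf-32-be",
    }[kind]
    raw = s.encode(codec)
    # "B", "H", "I" are 1, 2 and 4 bytes wide on mainstream platforms.
    fmt = {1: "B", 2: "H", 4: "I"}[kind]
    return kind, memoryview(raw).cast(fmt).tolist()
```

For example, `code_units("abc")` reports kind 1, while a string containing a character outside the Basic Multilingual Plane reports kind 4, with one array element per character in both cases.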
Release lessons learned:
- Need to expand build farm. All test configurations take more than 24 hours to run on Windows. At least one more Windows GPU worker needed.
- Need automated job to clear old build working directories
- Checklist was very helpful!
- Automatic build and test of development wheels?
  - Slowest manual part of release
  - Automated testing during the release cycle would detect issues like #3341
  - Where should we post dev wheels to? Too many to post to PyPI. Is Anaconda.org compatible with most recent wheel package? (YES! latest update fixed it)
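The automated cleanup job mentioned in the release lessons above could look roughly like the sketch below. The function name, the one-subdirectory-per-build layout, and the 7-day threshold are all assumptions made for illustration; the actual build farm layout is not described in these notes.

```python
import shutil
import time
from pathlib import Path


def clear_old_build_dirs(root, max_age_days=7, dry_run=True):
    """Find (and optionally delete) stale build working directories.

    Assumes each build gets its own immediate subdirectory of `root`.
    Returns the list of directories older than `max_age_days`.
    """
    cutoff = time.time() - max_age_days * 86400
    stale = []
    for entry in sorted(Path(root).iterdir()):
        # Skip plain files and symlinks; only real directories are removed.
        if entry.is_dir() and not entry.is_symlink():
            if entry.stat().st_mtime < cutoff:
                stale.append(entry)
                if not dry_run:
                    shutil.rmtree(entry)
    return stale
```

Defaulting to `dry_run=True` is a deliberate choice here, so that a scheduled job can first log what it would delete before it is trusted to remove anything.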

1. New issues

3341 - numba 0.40.0 installed via pip from PyPI: _dl_check_map_versions: Assertion `needed != NULL' failed!
- Can fix tbb part by rebuilding wheel
- auditwheel repair seems to be corrupting something causing the openmp segfault
- Once we fix or work around this, tag patch release 0.40.1
- Crashing numba -s
3339 - np.where typing needs to handle scalar selection with array input
- Bug report from gitter
3336 - Enable automatic parallel execution in pre-compiled code
- entangled with general problems with AOT and caching with parfors
- Also highlights missing compiler flag options with AOT compiling.
3334 - Copy to host broken for __cuda_array_interface__-derived strided DeviceNDArray.
- PR already submitted
3333 - Indexing broken for __cuda_array_interface__-derived DeviceNDArray.
- PR already submitted
3332 - Nested jit annotations bug
- Want to mix object mode and nopython mode in an unusual way.
3331 - jit bug
- Hitting reflection of nested list limitation
- Another reminder that "reflection" is problematic
3330 - Type Refactor
- Placeholder for type refactoring brainstorming
3328 - Avoid unnecessary compilation in type inference
- Wishlist item from last meeting
- Note @overload vs @jit performance difference. Is compiling the wrapper causing the nopython function body to be compiled twice?
- TODO: check @overload vs @njit speed difference.
3327 - CUDA 10 notes
- Current Numba seems to work with cudatoolkit 9 on the new driver, and directly with the new CUDA 10 toolkit
3326 - Heap Allocation Error at parallel execution
- list of list is very common
- how do we parallelize this?
- need legalization pass to raise errors when users do non-threadsafe things
3323 - Initial string support
- see discussion above
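The concern raised under 3326 above, parallel code that mutates shared containers, can be illustrated outside Numba. The sketch below uses plain Python threads (not parfors or prange) and hypothetical function names; it only shows the shape of a pattern that a legalization pass might reject, next to a restructured version in which each task builds its own inner list.

```python
from concurrent.futures import ThreadPoolExecutor


def build_rows_shared(n):
    """Pattern to avoid: every task appends to one shared outer list.

    In CPython the GIL makes each append atomic, so the only visible
    effect is that row order depends on scheduling. In natively
    compiled parallel code without such a lock, concurrent appends
    could race on the list's storage.
    """
    rows = []

    def work(i):
        rows.append([i * j for j in range(3)])

    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(work, range(n)))
    return rows


def build_rows_independent(n):
    """Each task returns its own row; results are gathered in input order."""

    def work(i):
        return [i * j for j in range(3)]

    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(work, range(n)))
```

The second form has no shared mutable state between tasks, which is the property a checker would need to establish before parallelizing a list-of-lists loop.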

2. Open PRs

New

3340 - Fix typo in error name.
- easy fix
3337 - [WIP] Support for np.ediff1d
- in progress
3335 - Fix memory management of __cuda_array_interface__ views.
- Needs review
3325 - [WIP] Add location information to exceptions.
- Segfaults in strange ways on CI
- Very useful, but need to fix
3324 - Support for np.nancumsum and np.nancumprod
- Needs review

Old

3320 - Support for np.partition
3307 - Adding NUMBA_ENABLE_PROFILING envvar, enabling jit event
3228 - Reduce redundant module linking
- Stuart will review
3162 - Support constant dtype string in nopython mode in functions like numpy.empty.
- Need to resolve #3195
3160 - First attempt at parallel diagnostics
- Stuart will implement Todd's suggestion
3134 - [WIP] Cfunc x86 abi
- Needs re-review
3124 - Fix 3119, raise for 0d arrays in reductions
- Stuart needs to implement feedback
3046 - Pairwise sum implementation.
2999 - Support LowLevelCallable
2983 - [WIP] invert mapping b/w binop operators and the operator module
2950 - Fix dispatcher to only consider contiguous-ness.
2942 - Fix linkage nature (declspec(dllexport)) of some test functions
2894 - [WIP] Implement jitclass default constructor arguments.
2817 - [WIP] Emit LLVM optimization remarks

===========================

4. Next Release: Version 0.41, RC=Nov 19, Final=Nov 26, 2018

Type refactoring
Parallel diagnostics
LLVM 7
Initial string support
Finishing off stalled PRs
Usual collection of bug fixes and small features
