Minutes_2018_09_25
Stan Seibert edited this page Sep 25, 2018 · 1 revision
Attendees: Stan, Ehsan, Stuart, Siu, Todd
- String discussion
  - Things we've learned about Python unicode so far
    - Get parts of the unicode struct using the `PyUnicode_GET_LENGTH`, `PyUnicode_KIND`, and `PyUnicode_DATA` C macros. Need to wrap these in helperlib.c functions.
    - Unicode strings built with the old C API are not automatically in "canonical" form. Need to call `PyUnicode_READY` first to prepare them, which may do some memory allocation.
    - Create a new unicode object with `PyUnicode_FromKindAndData`.
    - Once ready, the data pointer can be used as a `uint8`, `uint16`, or `uint32` array, depending on the value of `kind`.
  - Right Numba extension API to use?
    - Built-in types use registries
    - Documented extension API does not
    - Can cause strange behavior when using `@jit` vs. `compile_isolated` in unit tests?
  - How to upgrade strings without breaking existing partial support that treats them as a special constant type?
  - How to handle memory ownership?
    - Similar to array: Numba can have a view on Python-managed string storage. If Numba creates a string, we have to manage it with NRT in case it is put into a jitclass.
    - Different: As soon as we box the Numba string into a Python unicode object, Python copies the string into its own memory, and the refcount on the NRT version should be decremented (no Python unicode views on NRT-managed memory).
    - Better: Strings being immutable does make reasoning about all of this easier.
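The relationship between `kind` and the data array noted in the string discussion can be modeled in pure Python. This is only a sketch: it does not call the C macros, and the names `unicode_kind`, `code_units`, and `_KIND_LAYOUT` are hypothetical helpers invented for illustration. It assumes the canonical layout, where the kind (1, 2, or 4 bytes per code point) is set by the largest code point in the string.

```python
import struct

# Assumed mapping from kind (bytes per code point) to a little-endian codec
# and a struct format character. Chosen for this illustration only.
_KIND_LAYOUT = {
    1: ("latin-1", "B"),    # read as uint8
    2: ("utf-16-le", "H"),  # read as uint16
    4: ("utf-32-le", "I"),  # read as uint32
}


def unicode_kind(text):
    """Return the width in bytes (1, 2 or 4) needed for every code point."""
    largest = max(map(ord, text), default=0)
    if largest < 0x100:
        return 1
    if largest < 0x10000:
        return 2
    return 4


def code_units(text):
    """Return (kind, units), where units is the fixed-width array content."""
    kind = unicode_kind(text)
    codec, fmt = _KIND_LAYOUT[kind]
    # Every code point fits in `kind` bytes, so the encoded buffer has
    # exactly len(text) fixed-width entries.
    raw = text.encode(codec, errors="surrogatepass")
    units = struct.unpack("<%d%s" % (len(text), fmt), raw)
    return kind, list(units)
```

For example, `code_units("a\u20ac")` reports kind 2 because the euro sign needs two bytes, so the plain `a` is widened to two bytes as well.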
- Release lessons learned:
  - Need to expand build farm. All test configurations take more than 24 hours to run on Windows. At least one more Windows GPU worker needed.
  - Need automated job to clear old build working directories
  - Checklist was very helpful!
  - Automatic build and test of development wheels?
    - Slowest manual part of release
    - Automated testing during the release cycle to detect issues like #3341
    - Where should we post dev wheels to? Too many to post to PyPI. Is Anaconda.org compatible with the most recent `wheel` package? (YES! latest update fixed it)
- #3341 - numba 0.40.0 installed via pip from PyPI: _dl_check_map_versions: Assertion `needed != NULL' failed!
  - Can fix tbb part by rebuilding wheel
  - auditwheel repair seems to be corrupting something, causing the openmp segfault
  - Once we fix or work around this, tag patch release 0.40.1
  - Crashing numba -s
- #3339 - np.where typing needs to handle scalar selection with array input
  - Bug report from gitter
- #3336 - Enable automatic parallel execution in pre-compiled code
  - Entangled with general problems with AOT and caching with parfors
  - Also highlights missing compiler flag options with AOT compiling.
- #3334 - Copy to host broken for `__cuda_array_interface__`-derived strided DeviceNDArray.
  - PR already submitted
- #3333 - Indexing broken for `__cuda_array_interface__`-derived DeviceNDArray.
  - PR already submitted
- #3332 - Nested jit annotations bug
  - Want to mix object mode and nopython mode in an unusual way.
- #3331 - jit bug
  - Hitting reflection of nested list limitation
  - Another reminder that "reflection" is problematic
- #3330 - Type Refactor
  - Placeholder for type refactoring brainstorming
- #3328 - Avoid unnecessary compilation in type inference
  - Wishlist item from last meeting
  - Note @overload vs @jit performance difference. Is compiling the wrapper causing the nopython function body to be compiled twice?
  - TODO: check `@overload` vs `@njit` speed difference.
- #3327 - CUDA 10 notes
  - Current Numba seems to work with cudatoolkit 9 on the new driver and with the new CUDA 10 toolkit directly
- #3326 - Heap Allocation Error at parallel execution
  - List of lists is very common
  - How do we parallelize this?
  - Need legalization pass to raise errors when users do non-threadsafe things
- #3323 - Initial string support
  - See discussion above
- #3340 - Fix typo in error name.
  - Easy fix
- #3337 - [WIP] Support for np.ediff1d
  - In progress
- #3335 - Fix memory management of `__cuda_array_interface__` views.
  - Needs review
- #3325 - [WIP] Add location information to exceptions.
  - Segfaults in strange ways on CI
  - Very useful, but need to fix
- #3324 - Support for np.nancumsum and np.nancumprod
  - Needs review
- #3320 - Support for np.partition
- #3307 - Adding NUMBA_ENABLE_PROFILING envvar, enabling jit event
- #3228 - Reduce redundant module linking
  - Stuart will review
- #3162 Support constant dtype string in nopython mode in functions like numpy.empty.
  - Need to resolve #3195
- #3160 First attempt at parallel diagnostics
  - Stuart will implement Todd's suggestion
- #3134 [WIP] Cfunc x86 abi
  - Needs re-review
- #3124 Fix 3119, raise for 0d arrays in reductions
  - Stuart needs to implement feedback
- #3046 Pairwise sum implementation.
- #2999 Support LowLevelCallable
- #2983 [WIP] invert mapping b/w binop operators and the operator module
- #2950 Fix dispatcher to only consider contiguous-ness.
- #2942 Fix linkage nature (declspec(dllexport)) of some test functions
- #2894: [WIP] Implement jitclass default constructor arguments.
- #2817: [WIP] Emit LLVM optimization remarks
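As background for #3046, pairwise summation can be sketched in pure Python as follows. This is a generic illustration of the technique, not the code in that PR. The function names and the block size of 8 are assumptions made for the example.

```python
import math


def pairwise_sum(values, block=8):
    """Sum a sequence by recursively splitting it in half.

    Runs of at most `block` items are added with a plain loop. Larger
    inputs are split in two and the partial sums are added, which keeps
    rounding error growth roughly logarithmic in the input length.
    """
    n = len(values)
    if n <= block:
        total = 0.0
        for v in values:
            total += v
        return total
    mid = n // 2
    return pairwise_sum(values[:mid], block) + pairwise_sum(values[mid:], block)


def summation_error(values, summer):
    """Absolute error of summer(values) against math.fsum as reference."""
    return abs(summer(values) - math.fsum(values))
```

Calling `summation_error(data, pairwise_sum)` and `summation_error(data, sum)` on the same long list of floats gives a quick way to compare the two strategies.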
===========================
- Type refactoring
- Parallel diagnostics
- LLVM 7
- Initial string support
- Finishing off stalled PRs
- Usual collection of bug fixes and small features