
Numba Meeting: 2018-07-26

Attendees: Ehsan, Todd, Siu, Stuart, Stan

1. New issues

  • #3169: parfors lowering error in build_gufunc_wrapper ParallelAccelerator
    • Need a better error message
  • #3164: how to use python / numba / @vectorize non-scalar input
    • Blocked on the ufunc rewrite; a @guvectorize workaround is sketched after this list
  • #3163: @jit for generic jitclasses
    • Good idea; type inference is a little tricky. Will be rolled into a future redesign of jitclasses
  • #3161: Caching for ParallelAccelerator without Threading
    • Siu will take a look
  • #3159: np.eye() does not support dtype argument, which is not stated in the docs and does not return meaningful error code
    • Need better error and updated docs
  • #3158: Structured dtypes with multi-dim fields: numba either throws or crashes
    • Need strict dtype checking and raising of errors
    • Add notes to the docs about which dtypes are supported (an example of such a dtype is sketched after this list)
  • #3155: parfors hoist set item fail ParallelAccelerator bug
    • Related to implied accumulation in an expression (x = x * 2 rather than x *= 2); see the sketch after this list
    • Can temporarily skip the test to get builds passing again
    • Need a proper fix: either detect the pattern and raise an error (advising in-place operators for accumulators), or detect it and do the right thing
  • #3149: calling cuda math functions (eg. sincospi)
    • Will advise user
    • Need to think about where in the numba namespace this should go
  • #3146: Storing ctypes c_void_p type result in numpy array: Cannot cast void* to uint64
    • Siu will take a look
  • #3144: [Question] Would it be possible to support larger step size in prange? ParallelAccelerator
    • On the wishlist; a workaround is sketched after this list
  • #3141: numba acceleration not working in python-sgp4
    • Stuart will add links to likely related bugs
  • #3140: Jupyter integration. Output compiler log, diagnostic info, etc..
    • Good wishlist ideas; need to see how the Jupyter feature ends up
  • #3139: Parfors accumulator reuse bug
    • PR opened to fix
  • #3138: Access the global variable defined in other modules
    • Need to fix, not sure effort level yet
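A minimal sketch of the usual answer to #3164, assuming the question is about passing whole arrays: @vectorize kernels only ever see scalars, so array-valued arguments go through @guvectorize instead. The function name and signature here are illustrative, not taken from the issue.

```python
import numpy as np
from numba import guvectorize

# Illustrative only: @guvectorize maps a "core" function over array
# arguments, which @vectorize (scalar kernels) cannot express.
@guvectorize(["void(float64[:], float64[:])"], "(n)->(n)")
def double(x, out):
    for i in range(x.shape[0]):
        out[i] = 2.0 * x[i]

print(double(np.arange(4.0)))  # -> [0. 2. 4. 6.]
```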
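For #3158, this is the kind of input involved (an illustrative dtype, not the reporter's exact code): a structured dtype in which one field is itself multi-dimensional. Arrays of such dtypes currently make numba throw or crash, hence the strict-checking note above.

```python
import numpy as np

# A structured dtype with a multi-dimensional field: "pos" is a
# 3-element float64 sub-array inside each record.
rec = np.dtype([("pos", np.float64, (3,)), ("id", np.int32)])
arr = np.zeros(4, dtype=rec)  # passing arr into @njit code is the failing case
```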
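A simplified sketch of the accumulator spelling behind #3155 (an assumed shape of the bug, not the reporter's code): ParallelAccelerator's reduction and hoisting machinery prefers the in-place operator form, and the spelled-out assignment is what trips it.

```python
from numba import njit, prange

@njit(parallel=True)
def doubled_sum(a):
    s = 0.0
    for i in prange(a.shape[0]):
        s += a[i] * 2      # in-place operator: the recommended accumulator spelling
        # s = s + a[i] * 2 is the "implied accumulation" form noted above
    return s
```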
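For #3144, prange currently supports only a unit step; a common workaround (a sketch, not an official recipe) is to parallelize over the number of steps and recover the strided index by multiplication.

```python
from numba import njit, prange

@njit(parallel=True)
def touch_strided(a, step):
    n = (a.shape[0] + step - 1) // step  # number of strided positions
    for i in prange(n):
        a[i * step] += 1.0
```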

2. Open PRs

New

  • 3168 Py37 bytes output fix.
    • Merged this morning
  • 3167 In build script activate env before installing.
    • Merged this morning
  • 3166 [WIP] Objmode with-block
    • Seeing lots of places that could benefit from refactoring
  • 3165 Add FMA intrinsic support
    • Need to review and test in build farm
    • Is there CPU support for FMA we should add?
  • 3162 Support constant dtype string in nopython mode in functions like numpy.empty.
    • What this enables is sketched after this list
  • 3160 First attempt at parallel diagnostics
    • Looking for feedback on the utility of the output; the implementation is a bit hacky
  • 3153 Fix canonicalize_array_math typing for calls with kw args
  • 3152 [WIP] Use cuda driver api to get best blocksize for best occupancy
    • Once tested on Volta, ready for review
  • 3151 Keep a queue of references to last N deserialized functions. Fixes #3026
  • 3148 Remove dead array equal @infer code
  • 3145 [WIP] support for np.fill_diagonal
  • 3142 Issue3139
  • 3137 Fix for issue3103
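A sketch of what PR 3162 is expected to enable, assuming it lands as described: passing the dtype as a constant string in nopython mode, instead of requiring the NumPy type object.

```python
import numpy as np
from numba import njit

@njit
def make_buffer(n):
    # With PR 3162, the constant string form works in nopython mode;
    # previously this had to be written as np.empty(n, np.float64).
    return np.empty(n, dtype="float64")
```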

Old

  • 3134 [WIP] Cfunc x86 abi
    • Contributor updated patch and tests; needs re-review?
  • 3132 Adds an ~5 minute guide to Numba.
    • Stan posted comments
  • 3128 WIP: Fix recipe for jetson tx2/ARM
    • Will merge when ready
  • 3127 Support for reductions on arrays.
    • First-pass review done; waiting for changes
  • 3124 Fix 3119, raise for 0d arrays in reductions
  • 3122 WIP: Add inliner to object mode pipeline
    • Fixes closure inlining in object mode
    • Needs tests and review
  • 3093 [WIP] Singledispatch overload support for cuda array interface.
    • Needs review
  • 3046 Pairwise sum implementation.
    • The pairwise algorithm is sketched after this list
  • 3017 Add facility to support with-contexts
  • 2999 Support LowLevelCallable
  • 2983 [WIP] invert mapping b/w binop operators and the operator module
  • 2950 Fix dispatcher to only consider contiguous-ness.
  • 2942 Fix linkage nature (declspec(dllexport)) of some test functions
  • 2894 [WIP] Implement jitclass default constructor arguments.
  • 2817 [WIP] Emit LLVM optimization remarks

3. Feature Discussion

4. Next Release: Version 0.40, RC=Sept 3, 2018, Final=Sept 10, 2018

  • Experimental Python mode blocks (the objmode with-block)
  • Refactored threadpool interface
  • AMD GPU backend
  • Parallel diagnostics
  • Usual collection of bug fixes