v3.9.0

umar456 released this 29 Aug 19:49
· 2 commits to master since this release

Improvements

  • Add oneAPI backend #3296
  • Add support to directly access arrays on other devices #3447
  • Add asynchronous reduce all functions that return an af_array #3199
  • Add broadcast support #2871
  • Improve OpenCL CPU JIT performance #3257 #3392
  • Optimize thread/block calculations of several kernels #3144
  • Add support for fast math compilation when building ArrayFire #3334 #3337
  • Optimize performance of fftconvolve when using floats #3338
  • Add support for CUDA 12.1 and 12.2
  • Better handling of empty arrays #3398
  • Better handling of memory in linear algebra functions in OpenCL #3423
  • Better logging with JIT kernels #3468
  • Optimize memory manager/JIT interactions for small number of buffers #3468
  • Documentation improvements #3485
  • Optimize reorder function #3488
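
For readers unfamiliar with the term, the broadcast item above refers to combining arrays of different shapes. The sketch below illustrates the general idea only, assuming the common convention that a dimension of size 1 is stretched to match the other operand. `broadcast_add` is a hypothetical helper written against the standard library. It is not an ArrayFire function, and these notes do not spell out ArrayFire's exact rules.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of ArrayFire: adds an n x 1 column to a
// 1 x m row and returns the n x m result stored in column-major order.
std::vector<float> broadcast_add(const std::vector<float>& col,
                                 const std::vector<float>& row) {
    const std::size_t n = col.size();  // rows in the result
    const std::size_t m = row.size();  // columns in the result
    std::vector<float> out(n * m);
    for (std::size_t j = 0; j < m; ++j) {
        for (std::size_t i = 0; i < n; ++i) {
            // The column is reused for every j and the row for every i,
            // which is the "stretching" that broadcasting performs.
            out[j * n + i] = col[i] + row[j];
        }
    }
    return out;
}
```

Neither input is copied or tiled up front. Each output element reads the one value it needs from each operand, and avoiding that tiling is the usual motivation for broadcasting.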

Fixes

  • Improve errors when creating OpenCL contexts from devices #3257
  • Improvements to vcpkg builds #3376 #3476
  • Fix reduce by key when NaNs are present #3261
  • Fix error in convolve where the ndims parameter was forced to be equal to 2 #3277
  • Make constructors that accept dim_t explicit to avoid invalid conversions #3259
  • Fix error in randu when compiling against clang 14 #3333
  • Fix bug in OpenCL linear algebra functions #3398
  • Fix bug with thread local variables when device was changed #3420 #3421
  • Fix bug in qr related to uninitialized memory #3422
  • Fix bug in shift where the array had an empty middle dimension #3488
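
The `dim_t` constructor fix above addresses a general C++ hazard: a single-argument constructor that is not marked `explicit` lets an integer convert silently into an object. The sketch below shows that hazard with hypothetical `LooseBuffer` and `StrictBuffer` types. They are illustrative stand-ins, not ArrayFire classes, and the `dim_t` alias here is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Assumed alias for illustration only; not taken from ArrayFire's headers.
using dim_t = long long;

// Implicit constructor: an integer silently becomes a buffer of that length.
struct LooseBuffer {
    std::vector<float> data;
    LooseBuffer(dim_t n) : data(static_cast<std::size_t>(n)) {}
};

// Explicit constructor: the caller has to spell out the construction.
struct StrictBuffer {
    std::vector<float> data;
    explicit StrictBuffer(dim_t n) : data(static_cast<std::size_t>(n)) {}
};

std::size_t loose_length(const LooseBuffer& b) { return b.data.size(); }
std::size_t strict_length(const StrictBuffer& b) { return b.data.size(); }

// The implicit conversion exists only for the non-explicit constructor.
static_assert(std::is_convertible<dim_t, LooseBuffer>::value,
              "an integer converts implicitly to LooseBuffer");
static_assert(!std::is_convertible<dim_t, StrictBuffer>::value,
              "an integer does not convert implicitly to StrictBuffer");

void demo() {
    // Compiles: the literal 5 is converted into a 5-element buffer, which is
    // rarely what a caller passing a scalar intended.
    assert(loose_length(5) == 5);
    // strict_length(5) would not compile; the intent must be stated.
    assert(strict_length(StrictBuffer(5)) == 5);
}
```

With the explicit version, a call that passes a bare integer where an object is expected fails at compile time instead of allocating an unintended buffer.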

Contributions

Special thanks to our contributors:
Willy Born
Mike Mullen