Release highlights:
- Features
  - Updated cuTENSOR and cuTensorNet versions
  - Added configurable print formatting
  - ARM FFT support via NVPL
  - New operators: abs2(), outer(), isnan(), isinf()
  - Many more CPU unit tests
- Bug fixes for matmul on Hopper, 2D FFTs, and more
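
For readers unfamiliar with two of the new operators, the following is a minimal reference sketch in plain standard C++ of what abs2() (squared abs()) and outer() (outer product) compute. It does not use the library's API: the names `abs2_ref` and `outer_ref` are hypothetical helpers introduced only for illustration, and the row-major layout is an assumption of the sketch.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the library API): squared magnitude of a complex
// value, re*re + im*im, computed without the square root abs() would need.
float abs2_ref(std::complex<float> z) {
    return z.real() * z.real() + z.imag() * z.imag();
}

// Hypothetical helper (not the library API): outer product of a (length M)
// and b (length N), returned as a row-major M x N matrix where
// out[i * N + j] = a[i] * b[j].
std::vector<float> outer_ref(const std::vector<float>& a,
                             const std::vector<float>& b) {
    std::vector<float> out(a.size() * b.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        for (std::size_t j = 0; j < b.size(); ++j) {
            out[i * b.size() + j] = a[i] * b[j];
        }
    }
    return out;
}
```

For example, `abs2_ref({3.0f, 4.0f})` yields 25, and the outer product of a 2-element and a 3-element vector yields a 2 x 3 matrix.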
Full changelist:
What's Changed
- Increase cublas workspace to 32 MiB for Hopper+ by @tbensonatl in #545
- matmul bug fixes by @luitjens in #547
- Added missing synchronization by @luitjens in #552
- Refine some file I/O functions' doxygen comments by @AtomicVar in #549
- Update docs by @tmartin-gh in #551
- Export used environment variables in sphinx config by @tmartin-gh in #553
- Import os by @tmartin-gh in #554
- Add version info by @tmartin-gh in #555
- Fix typo by @tmartin-gh in #556
- Adds IsNan and IsInf Operators by @nvjonwong in #557
- Use cmake project version info in sphinx config by @tmartin-gh in #560
- outer() operator for outer product by @cliffburdick in #559
- Fix NaNs in QR and SVD by @luitjens in #558
- Update CMakeLists.txt by @cliffburdick in #548
- Fix CMake to allow multiple rapids-cmake to coexist by @cliffburdick in #562
- Return 0D arrays for 0D shape in operators by @cliffburdick in #561
- Fix NVTX3 include path by @AtomicVar in #564
- Add .npy File I/O by @AtomicVar in #565
- SVD & QR improvements by @luitjens in #563
- chore: Fix typo s/whereever/wherever/ by @hugo-syn in #566
- Add rapids-cmake-dir, if defined, to CMAKE_MODULE_PATH by @tbensonatl in #567
- Add abs2() operator for squared abs() by @tbensonatl in #568
- Fixed issue on g++13 with nullptr dereference that cannot happen at r… by @cliffburdick in #571
- Force max(min) size of direct convolution dimension to be < 1024 by @cliffburdick in #573
- Remove incorrect warning check for any compiler other than gcc by @cliffburdick in #577
- stream memory cleanup by @cliffburdick in #579
- Update reshape indices by @cliffburdick in #580
- Update matlabpython.rst by @cliffburdick in #583
- Prevent potential oob read in matxOpTDKernel by @tbensonatl in #586
- Broadcast lower-rank tensors during batched matmul by @tbensonatl in #585
- Fix bugs in 2D FFTs and add tests by @benbarsdell in #587
- Added ARM FFT Support by @cliffburdick in #576
- Various bug fixes for older compilers by @cliffburdick in #588
- Renamed rmin/rmax functions to min/max; the element-wise versions are now minimum/maximum, to match Python by @cliffburdick in #589
- Fix clang macro by @cliffburdick in #592
- Fix misplaced sentence in README by @lucifer1004 in #594
- Add configurable print formatting types by @tmartin-gh in #593
- Fixing return types to allow either prvalue or lvalue in operator() by @cliffburdick in #598
- Rework einsum for new cache style. Fix for issue #597 by @tmartin-gh in #599
- Updated cutensornet to 24.03 and cutensor to 2.0.1 by @cliffburdick in #600
- Add file name and line number to ease debugging by @bhaskarrakshit in #601
- Updating versions and notes for v0.8.0 by @cliffburdick in #602
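
The rmin/rmax rename in #589 follows the NumPy-style convention in which min/max collapse a sequence to a single value and minimum/maximum compare two inputs position by position. The sketch below shows that distinction in plain standard C++. It assumes rmin/rmax were reductions, and the names `reduce_min` and `elementwise_minimum` are hypothetical helpers, not the library's API.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Reduction, in the style of min(): collapses a non-empty sequence
// to its single smallest value.
float reduce_min(const std::vector<float>& v) {
    return *std::min_element(v.begin(), v.end());
}

// Element-wise, in the style of minimum(): compares two equal-length
// sequences position by position and returns a sequence of the same length.
std::vector<float> elementwise_minimum(const std::vector<float>& a,
                                       const std::vector<float>& b) {
    std::vector<float> out(a.size());
    std::transform(a.begin(), a.end(), b.begin(), out.begin(),
                   [](float x, float y) { return std::min(x, y); });
    return out;
}
```

Code that previously called the element-wise functions under the names min/max would presumably need to switch to minimum/maximum after this change.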
New Contributors
- @hugo-syn made their first contribution in #566
- @benbarsdell made their first contribution in #587
- @lucifer1004 made their first contribution in #594
- @bhaskarrakshit made their first contribution in #601
Full Changelog: v0.7.0...v0.8.0