Halide v15.0.1

@steven-johnson steven-johnson released this 07 Apr 23:21
· 1 commit to release/15.x since this release
4c63f1b

What's Changed

  • The Python binding of compile_to_callable() did not copy output buffers from device to host, so with a GPU target the output was typically black (or garbage). (#7213)
  • The bin directory was missing from the installs.
  • Upgraded LLVM to 15.0.7.
  • New in 15.0.0, but restated here for visibility: the target flag disable_llvm_loop_opt is deprecated, because disabling LLVM's loop optimizations is now the default behavior. In other words, LLVM's autovectorization and loop unrolling are now turned off. This should not affect schedules with manually specified vectorization and unrolling, other than trimming code size a little. However, schedules that do not vectorize or unroll may slow down, because they were (intentionally or not) relying on LLVM to do it automatically. If you see a performance regression with Halide 15, try turning on the enable_llvm_loop_opt target flag.
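
To test for the regression described in the last item, the flag can be appended to whatever target string you already use. The snippet below is a minimal sketch, not part of Halide: `with_llvm_loop_opt` is a hypothetical helper that only manipulates the dash-separated target string, and it assumes the JIT target is taken from the `HL_JIT_TARGET` environment variable. Dropping the deprecated `disable_llvm_loop_opt` token is a choice made here for tidiness, not something the release notes require.

```python
import os

LOOP_OPT_ON = "enable_llvm_loop_opt"
LOOP_OPT_OFF = "disable_llvm_loop_opt"  # deprecated as of Halide 15


def with_llvm_loop_opt(target: str) -> str:
    """Hypothetical helper: return `target` with enable_llvm_loop_opt set."""
    # Halide target strings are dash-separated tokens, e.g. "host-cuda".
    # An empty string is treated as "host" (an assumption of this sketch).
    tokens = [t for t in (target or "host").split("-") if t and t != LOOP_OPT_OFF]
    if LOOP_OPT_ON not in tokens:
        tokens.append(LOOP_OPT_ON)
    return "-".join(tokens)


if __name__ == "__main__":
    # Start from the current JIT target, falling back to "host".
    base = os.environ.get("HL_JIT_TARGET", "host")
    os.environ["HL_JIT_TARGET"] = with_llvm_loop_opt(base)
    print(os.environ["HL_JIT_TARGET"])
```

The resulting string can be used wherever you already supply a target string, for example when building a generator ahead of time. Compare timings with and without the flag before keeping it, since schedules that vectorize and unroll explicitly should not need it.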