ONNX Runtime v1.17.3

@sophies927 released this 18 Apr 15:46 (commit 56b660f)

What's new?

General:

  • Update the API header file copy step so the Linux logic is consistent with Windows (#19736) - @mszhanyi
  • Pin ONNX version to fix DML and Python packaging pipeline exceptions (#20073) - @mszhanyi

Build System & Packages:

  • Fix a bug in minimal builds with training APIs enabled that affected the Apple framework (#19858) - @edgchen1

Windows:

  • Fix Windows memory mapping bug affecting some larger models (#19623) - @yufenglee

Kernel Optimizations:

  • Fix GQA and Rotary Embedding bugs affecting some models (#19801, #19874) - @aciddelgado
  • Update replacement of MultiHeadAttention (MHA) and GroupQueryAttention (GQA) (#19882) - @kunal-vaishnavi
  • Add support for packed QKV input and Rotary Embedding with sm<80 using Memory Efficient Attention kernel (#20012) - @aciddelgado

This patch release also includes additional fixes by @spampana95 and @enximi. Big thank you to all our contributors!