Intel® Extension for Transformers v1.4.1 Release

Released by @kevinintel · 21 Apr 08:38 · commit 0fc6e01

  • Highlights
  • Improvements
  • Examples
  • Bug Fixing

Highlights

  • Support Weight-only Quantization on MTL iGPU (see the usage sketch after this list)
  • Upgrade lm-eval to 0.4.2
  • Support Llama3
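
A minimal sketch of how the weight-only quantization and Llama3 highlights combine in the usual loading flow. The model id and prompt are illustrative; `load_in_4bit` applies weight-only quantization at load time, and placement on the MTL iGPU (e.g. a `device_map="xpu"` argument) may need extra setup that is not shown here.

```python
from transformers import AutoTokenizer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

# Illustrative model id; any causal LM on the Hugging Face Hub should work.
model_name = "meta-llama/Meta-Llama-3-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_name)

# load_in_4bit applies weight-only quantization while loading the checkpoint.
model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True)

inputs = tokenizer("Once upon a time", return_tensors="pt").input_ids
outputs = model.generate(inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```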

Improvements

  • Support TPP for Xeon Tensor Parallel (5f0430f)
  • Refine model from_pretrained behavior when use_neural_speed is set (39ecf38e; see the sketch after this list)
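
A minimal sketch of the from_pretrained path this refinement touches, assuming use_neural_speed is accepted as a keyword argument that toggles the Neural Speed backend (the model id is illustrative):

```python
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

# Assumption for illustration: use_neural_speed=False keeps the plain PyTorch
# path, while the default routes 4-bit models through the Neural Speed backend.
model = AutoModelForCausalLM.from_pretrained(
    "Intel/neural-chat-7b-v3-1",  # illustrative model id
    load_in_4bit=True,
    use_neural_speed=False,
)
```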

Examples

  • Add vision front-end demo (1c6550)
  • Add example for table extraction and enable the multi-page table handling pipeline (db9e6fb)
  • Adapt the textual inversion distillation for quantization example to the latest transformers and diffusers packages (0ec83b1)
  • Update NeuralChat notebooks (83bb65a, 629b9d4; see the usage sketch after this list)
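
The NeuralChat notebooks build on the basic chatbot flow sketched below; build_chatbot with no arguments pulls a default model and configuration, so treat this as a minimal illustration rather than a step-by-step account of the notebooks.

```python
from intel_extension_for_transformers.neural_chat import build_chatbot

# Build a chatbot with the default configuration; pass a PipelineConfig to
# customize the model or enable plugins (not shown here).
chatbot = build_chatbot()
response = chatbot.predict("Tell me about Intel Xeon Scalable Processors.")
print(response)
```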

Bug Fixing

  • Fix QBits actshuf buffer overflow with large batch sizes (a6f3ab3)
  • Fix TPP support for single socket (a690072)
  • Fix retrieval dependency (281b0a3)
  • Fix loading issue for weight-only quantized (WOQ) models with parameters (37f9db25)

Validated Configurations

  • Python 3.10
  • Ubuntu 22.04
  • PyTorch 2.2.0+cpu
  • Intel® Extension for PyTorch 2.2.0+cpu
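
A quick sketch for checking that a local environment matches the validated stack above; it assumes the standard import names for the packages listed.

```python
import platform
import torch
import intel_extension_for_pytorch as ipex

# Expected values per the validated configuration:
# Python 3.10.x, torch 2.2.0+cpu, intel_extension_for_pytorch 2.2.0+cpu.
print("Python:", platform.python_version())
print("PyTorch:", torch.__version__)
print("IPEX:", ipex.__version__)
```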