
Releases: huggingface/optimum

v1.19.2: Patch release

09 May 11:17
  • Update the Transformers dependency in the Habana extra #1851 @regisss

Full Changelog: v1.19.1...v1.19.2

v1.19.1: Patch release

24 Apr 12:35
  • Bump transformers version by @echarlaix in #1824
  • Remove call to apt update before apt purge in the main doc build workflow by @regisss in #1830

Full Changelog: v1.19.0...v1.19.1

v1.19.0: Musicgen, MarkupLM ONNX export

16 Apr 13:45

Extended ONNX export

Musicgen and MarkupLM models from Transformers can now be exported to ONNX through optimum-cli export onnx. The Musicgen ONNX export can be used to run the model locally in a browser through transformers.js.
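As a minimal sketch of these exports (facebook/musicgen-small and microsoft/markuplm-base are example checkpoints from the Hub):

optimum-cli export onnx --model facebook/musicgen-small musicgen_onnx/
optimum-cli export onnx --model microsoft/markuplm-base markuplm_onnx/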

Other changes and bugfixes

New Contributors

Full Changelog: v1.18.0...v1.19.0

v1.18.1: Patch release

09 Apr 09:58

Fix the installation for the Optimum Neuron v0.0.21 release

  • Improve the installation of optimum-neuron through optimum extras #1778
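A minimal sketch of such an install through the Optimum extras (the exact extra name, neuronx here, is an assumption):

pip install "optimum[neuronx]"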

Fix the task inference of stable diffusion

  • Fix infer task for stable diffusion #1793

Full Changelog: v1.18.0...v1.18.1

v1.18.0: Gemma, OWLv2, MPNet, Qwen2 ONNX support

25 Mar 13:32

ONNX export for new architectures: Gemma, OWLv2, MPNet, Qwen2
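As a minimal sketch of exporting one of these (google/gemma-2b is an example checkpoint; gated checkpoints require a Hub token):

optimum-cli export onnx --model google/gemma-2b gemma_onnx/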

Other changes and bugfixes

v1.17.1: Patch release

18 Feb 02:19

Update Transformers dependency for the release of Optimum Habana v1.10.2

  • Update Transformers dependency in Habana extra #1700

Full Changelog: v1.17.0...v1.17.1

v1.17.0: Improved ONNX support & many bugfixes

16 Feb 09:22

ONNX export from nn.Module

A function is exposed to programmatically export any nn.Module (e.g. models coming from Transformers, but modified). This is useful in case you need to modify a model loaded from the Hub before exporting it. Example:

from transformers import AutoModelForImageClassification
from optimum.exporters.onnx import onnx_export_from_model

model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")

# Here one could do any modification on the model before the export.
onnx_export_from_model(model, output="vit_onnx")

ONNX export with static shapes

The Optimum ONNX export CLI allows disabling dynamic shapes for inputs/outputs:

optimum-cli export onnx --model timm/ese_vovnet39b.ra_in1k out_vov --no-dynamic-axes

This is useful if the exported model is to be consumed by a runtime that does not support dynamic shapes. Static shapes can be specified, e.g. with --batch_size 1. See all the shape options in optimum-cli export onnx --help.
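For example, a sketch fixing both the batch size and the sequence length (flag names as listed by optimum-cli export onnx --help):

optimum-cli export onnx --model bert-base-uncased --no-dynamic-axes --batch_size 1 --sequence_length 128 bert_onnx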

BF16 ONNX export

The Optimum ONNX export now supports BF16 export on CPU and GPU. Beware though that ONNX Runtime is most often not able to consume the models, as some operations are not implemented in this data type, although the exported models comply with the ONNX standard. This is useful if you are developing a runtime that consumes BF16 ONNX models.

Example:

optimum-cli export onnx --model bert-base-uncased --dtype bf16 bert_onnx 

ONNX export for new models

You can now export table-transformer, as well as bart for the text-classification task, to ONNX.
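A minimal sketch (microsoft/table-transformer-detection is an example checkpoint):

optimum-cli export onnx --model microsoft/table-transformer-detection table_transformer_onnx/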

Sentence Transformers ONNX export
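As a minimal sketch of the export (the checkpoint and the explicit task flag are assumptions; the task can usually be inferred):

optimum-cli export onnx --model sentence-transformers/all-MiniLM-L6-v2 --task feature-extraction all_minilm_onnx/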

Timm models support with ONNX Runtime

Timm models can now be run through ONNX Runtime with the class ORTModelForImageClassification:

from urllib.request import urlopen

import timm
import torch
from PIL import Image

from optimum.onnxruntime import ORTModelForImageClassification

# Export the model to ONNX under the hood with export=True.
model = ORTModelForImageClassification.from_pretrained("timm/resnext101_64x4d.c1_in1k", export=True)

# Get model specific transforms (normalization, resize).
data_config = timm.data.resolve_data_config(pretrained_cfg=model.config.pretrained_cfg)
transforms = timm.data.create_transform(**data_config, is_training=False)

img = Image.open(
    urlopen("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png")
)
output = model(transforms(img).unsqueeze(0)).logits
top5_probabilities, top5_class_indices = torch.topk(torch.softmax(output, dim=1) * 100, k=5)

Other changes and bugfixes

New Contributors

Full Changelog: v1.16.0...v1.17.0

v1.16.2: Patch release

19 Jan 16:01
  • Fix ORT training compatibility for transformers v4.36.0 by @AdamLouly #1586

  • Fix ONNX export compatibility for transformers v4.37.0 by @echarlaix #1641

v1.16.1: Patch release

15 Dec 10:54

Breaking change: BetterTransformer for llama, falcon, whisper, bart is deprecated

The BetterTransformer features for Llama, Falcon, Whisper and Bart have been upstreamed into Transformers. Please use transformers>=4.36 and torch>=2.1.1 to use PyTorch's scaled_dot_product_attention by default.

More details: https://github.com/huggingface/transformers/releases/tag/v4.36.0
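A minimal sketch of relying on SDPA directly in Transformers (the checkpoint is an example; attn_implementation is the from_pretrained argument introduced for this):

from transformers import AutoModelForCausalLM

# With transformers>=4.36 and torch>=2.1.1, SDPA is the default for
# supported architectures; it can also be requested explicitly.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    attn_implementation="sdpa",
)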

What's Changed

New Contributors

Full Changelog: v1.16.0...v1.16.1

v1.16.0: Transformers 4.36 compatibility, extended ONNX support, Mixtral GPTQ

13 Dec 18:23

Transformers 4.36 compatibility

Notably, the ONNX export now exports aten::scaled_dot_product_attention in a standardized way for the compatible models.

Extended ONNX support: timm, sentence-transformers, Phi, ESM

GPTQ for Mixtral

Work in progress.

  • add modules_in_block_to_quantize arg for gptq by @SunMarc in #1585
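A minimal sketch of the new argument (assuming optimum.gptq.GPTQQuantizer; the module names are illustrative, not an exact Mixtral layout):

from optimum.gptq import GPTQQuantizer

# modules_in_block_to_quantize restricts GPTQ to the listed submodules
# of each Transformer block; inner lists are quantized sequentially.
quantizer = GPTQQuantizer(
    bits=4,
    dataset="c4",
    modules_in_block_to_quantize=[
        ["self_attn.q_proj", "self_attn.k_proj", "self_attn.v_proj"],
        ["self_attn.o_proj"],
    ],
)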

What's Changed

Full Changelog: v1.15.0...v1.16.0