Releases: OpenNMT/CTranslate2

CTranslate2 4.3.0

17 May 08:20
173a0d1

New features

Fixes and improvements

  • Fix a regression in Flash Attention (#1695)

CTranslate2 4.2.1

24 Apr 10:04
0527ef7

Note: Because the package size grew beyond 100 MB, the v4.2.0 release could not be pushed successfully.

New features

  • Support load/unload for generator/Whisper Attention (#1670)

Fixes and improvements

CTranslate2 4.2.0

10 Apr 11:41
e491a51

New features

  • Support Flash Attention (#1651)
  • Implement GEMM for the FLOAT32 compute type with the RUY backend (#1598)
  • Conv1D quantization on CPU only (the DNNL and CUDA backends are not supported) (#1601)

Fixes and improvements

  • Fix a bug in tensor parallelism (#1643)
  • Use BestSampler when temperature is 0 (#1659)
  • Fix a bug with Gemma (#1660)
  • Optimize loading/unloading time for Translator with cache (#1645)
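
The BestSampler change (#1659) can be pictured with the sketch below. It assumes that BestSampler means greedy selection of the highest-scoring token, which the notes above do not spell out. The helper pick_token and its parameters are hypothetical and are not part of the CTranslate2 API.

```python
import math
import random


def pick_token(logits, temperature, rng=None):
    """Return a token index from raw logits (hypothetical helper)."""
    if temperature == 0:
        # Dividing by a zero temperature is undefined, so fall back to
        # greedy selection: the index of the largest logit.
        return max(range(len(logits)), key=lambda i: logits[i])

    rng = rng or random.Random()
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    # Draw one index with probability proportional to its softmax weight.
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```

With temperature 0 the result is deterministic; with any positive temperature it depends on the random generator passed in.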

CTranslate2 4.1.1

12 Mar 08:59
bfa0cb3

Fixes and improvements

  • Fix classifiers in setup.py so that the PyPI package can be pushed

CTranslate2 4.1.0

11 Mar 16:15
27092e4

New features

  • Support Gemma Model (#1631)
  • Support Tensor Parallelism (#1599)
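
As a rough illustration of what tensor parallelism means, the sketch below splits a weight matrix by columns, multiplies each shard separately, and concatenates the partial outputs. This is one common scheme and is only an assumption here: the notes do not say how #1599 partitions the weights. All function names are hypothetical, and the "devices" are simulated by a plain loop.

```python
def matmul(a, b):
    """Plain matrix product of a (m x n) and b (n x p), as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def split_columns(weight, parts):
    """Split a matrix into `parts` column blocks of equal width."""
    width = len(weight[0]) // parts
    return [[row[p * width:(p + 1) * width] for row in weight] for p in range(parts)]


def column_parallel_matmul(x, weight, parts):
    """Compute x @ weight shard by shard (assumes columns divide evenly)."""
    shards = split_columns(weight, parts)
    # Each shard would live on its own device; here they run one after another.
    partials = [matmul(x, shard) for shard in shards]
    # Concatenate the partial outputs along the column axis.
    return [sum((partial[r] for partial in partials), []) for r in range(len(x))]
```

The sharded result matches the unsharded product, so each device only needs to hold its own slice of the weights.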

Fixes and improvements

  • Avoid initializing unused GPU (#1633)
  • Read very large tensors in chunks when the size exceeds the maximum int value (#1636)
  • Update README
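
The chunked tensor read (#1636) follows a familiar pattern, sketched below. This is an illustration under assumptions, not the library's code: the helper read_in_chunks and its max_chunk parameter are hypothetical, and a small limit stands in for the maximum int value.

```python
import io


def read_in_chunks(stream, total_size, max_chunk):
    """Read total_size bytes from stream, at most max_chunk bytes per call."""
    buffer = bytearray()
    remaining = total_size
    while remaining > 0:
        # Never request more than max_chunk bytes in a single read.
        chunk = stream.read(min(remaining, max_chunk))
        if not chunk:
            raise EOFError("stream ended before the tensor was fully read")
        buffer.extend(chunk)
        remaining -= len(chunk)
    return bytes(buffer)


# Usage: a 10-byte payload read with a 4-byte limit per read call.
payload = bytes(range(10))
restored = read_in_chunks(io.BytesIO(payload), len(payload), max_chunk=4)
```

No single read request exceeds the limit, however large the total size is.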

CTranslate2 4.0.0

15 Feb 12:51
61492e0

This major version introduces a breaking change: the update to CUDA 12.

Breaking changes

Python

  • Support CUDA 12

New features

  • Add the to_device() method to the Python StorageView class to move data between host and device

Fixes and improvements

  • Implement Conv1D with im2col and GEMM to improve performance
  • Get tokens in the range of the vocab size for Llama models
  • Fix loss of performance
  • Update cibuildwheel to 2.16.5
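
The im2col and GEMM approach to Conv1D mentioned above can be sketched in plain Python. The sketch assumes stride 1, no padding, and no bias, and uses nested lists instead of real tensors. The function names are illustrative only and do not mirror the CTranslate2 implementation.

```python
def im2col_1d(x, kernel_size):
    """Unfold x (channels x length) into a (channels * kernel_size) x out_length matrix."""
    out_length = len(x[0]) - kernel_size + 1
    return [
        [x[c][t + k] for t in range(out_length)]
        for c in range(len(x))
        for k in range(kernel_size)
    ]


def matmul(a, b):
    """Plain GEMM of a (m x n) and b (n x p), as nested lists."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def conv1d(x, weight):
    """Convolve x with weight (out_channels x in_channels x kernel_size)."""
    kernel_size = len(weight[0][0])
    cols = im2col_1d(x, kernel_size)
    # Flatten each filter into one row, in the same channel-then-tap order
    # as the im2col rows, so the whole convolution becomes a single GEMM.
    flat_weight = [[w for channel in filt for w in channel] for filt in weight]
    return matmul(flat_weight, cols)
```

The unfolding copies overlapping input windows into columns, trading extra memory for one large matrix product that optimized GEMM routines handle well.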

CTranslate2 3.24.0

09 Jan 09:17
c95fd4e

New features

  • Support a new offset option to ignore the token scores of special tokens

CTranslate2 3.23.0

05 Dec 11:33
83caf67

New features

  • Support Phi model

Fixes and improvements

  • Fix the conversion of Whisper models without "alignment_heads" in "generation_config.json"
  • Fix forward batch

CTranslate2 3.22.0

22 Nov 21:31
d963499

New features

  • Support "sliding window" and "chunking input" for Mistral
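
For readers unfamiliar with the term, a sliding window restricts each position to attending only to the most recent tokens. The sketch below builds such a mask. It assumes a causal window that includes the current position, and the helper name is hypothetical. It illustrates the idea only and says nothing about how CTranslate2 implements it.

```python
def sliding_window_mask(seq_len, window):
    """Return mask[i][j], True when query position i may attend to key position j."""
    return [
        # Position i sees itself and the (window - 1) positions before it.
        [i - window < j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Usage: 4 positions, each attending to at most 2 positions.
mask = sliding_window_mask(4, 2)
```

With a window at least as large as the sequence, the mask reduces to an ordinary causal mask.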

Fixes and improvements

  • Take into account the "generation_config.json" and fix "lang_ids" getter for Whisper converter
  • Accept a callback in the "generate_tokens" method as well
  • Fix iomp5 linking with the latest Intel oneAPI on Ubuntu
  • Fix "decoder_start_token_id" for T5

CTranslate2 3.21.0

09 Nov 16:45
1e37b52

New features

  • Minimal support for Mistral (loader and rotary extension for long sequences); no sliding window yet
  • Support Distil-Whisper
  • Support Whisper-large-v3