Releases: meilisearch/charabia
Tokenizer v0.2.8
Changes
- Changes related to the rebranding (#66)
- Update LICENSE (#67) @curquiza
- Small fix in `benches/` (#71) @Thearas
- Set up the lindera tokenizer for Japanese (ja) support (related to #49) (#70) @miiton
- Benchmark and optimize Japanese tokenization (#73) @ManyTheFish
- Decompose Japanese compound words (#75) @mosuka
- Update the dependencies (#80) @Kerollmops
Thanks again to @Kerollmops, @ManyTheFish, @Thearas, @curquiza, @miiton and @mosuka! 🎉
Tokenizer v0.2.7
Tokenizer v0.2.6
Changes
- Test Meilisearch issue 1714 (#58) @ManyTheFish
- Exclude Hangul from `is_cjk` (#60) @datamaker
- Add mapping between bytes in original word and normalized word (#59) @Samyak2
Thanks again to @ManyTheFish, @Samyak2, @datamaker and JB! 🎉
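The `is_cjk` change above can be illustrated with a minimal sketch. The function names and the exact Unicode ranges below are illustrative assumptions, not charabia's actual implementation: the point is that Hangul blocks are classified separately instead of being lumped in with the CJK ideograph ranges.

```rust
// Hedged sketch: Hangul is matched by its own predicate and is
// deliberately NOT part of the is_cjk check. Ranges are a subset
// of the relevant Unicode blocks, chosen for illustration.
fn is_hangul(c: char) -> bool {
    matches!(c,
        '\u{1100}'..='\u{11FF}'   // Hangul Jamo
        | '\u{3130}'..='\u{318F}' // Hangul Compatibility Jamo
        | '\u{AC00}'..='\u{D7AF}' // Hangul Syllables
    )
}

fn is_cjk(c: char) -> bool {
    // CJK Unified Ideographs ranges only; Hangul excluded.
    matches!(c,
        '\u{4E00}'..='\u{9FFF}'
        | '\u{3400}'..='\u{4DBF}'
        | '\u{20000}'..='\u{2A6DF}'
    )
}

fn main() {
    assert!(is_cjk('漢'));      // CJK ideograph
    assert!(!is_cjk('한'));     // Hangul syllable no longer counts as CJK
    assert!(is_hangul('한'));
    assert!(!is_hangul('漢'));
    println!("ok");
}
```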
Tokenizer v0.2.5
Changes
- Rename ZeroRemover to ControlCharacterRemover (#55) @ManyTheFish
- Add a rustfmt config file into the project (#57) @Kerollmops
Thanks again to @Kerollmops, @ManyTheFish, and @curquiza! 🎉
Tokenizer v0.2.4
Changes
- Introduce a new default normalizer that removes zeroes from tokens (#52) @Kerollmops
Thanks again to @Kerollmops! 🎉
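The zero-removing normalizer introduced in #52 can be sketched as a simple character filter. The function name here is hypothetical and charabia's real `Normalizer` trait works on tokens rather than bare strings; this only shows the idea of stripping NUL characters.

```rust
// Hedged sketch: drop NUL ('\0') characters from a token's text.
// Illustrative only; not charabia's actual Normalizer API.
fn remove_zeroes(token: &str) -> String {
    token.chars().filter(|&c| c != '\u{0}').collect()
}

fn main() {
    assert_eq!(remove_zeroes("he\u{0}llo"), "hello");
    assert_eq!(remove_zeroes("clean"), "clean");
    println!("ok");
}
```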
Tokenizer v0.2.3
Changes
- Make legacy tokenizer handle unicode separators (#47) @ManyTheFish
Thanks again to @ManyTheFish! 🎉
Tokenizer v0.2.2
Changes
- Fix non-breaking space separator (#44) @shekhirin
Thanks again to @LegendreM, and @shekhirin! 🎉
Tokenizer v0.2.1
Changes
- Add release drafter files (#37) @curquiza
- Add bors (#41) @curquiza
- Fix separators: treat cyrillic chars as non-separators (#39) @shekhirin
Thanks again to @shekhirin! 🎉
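The two separator fixes above (treating the non-breaking space as a separator in #44, and Cyrillic characters as non-separators in #39) can be illustrated with a hedged sketch. The predicate below is an assumption for illustration; charabia's real separator logic is more elaborate.

```rust
// Hedged sketch of separator classification. Rust's is_whitespace()
// follows the Unicode White_Space property, which includes U+00A0
// (non-breaking space); alphabetic characters such as Cyrillic
// letters are never separators here.
fn is_separator(c: char) -> bool {
    c.is_whitespace() || c.is_ascii_punctuation()
}

fn main() {
    assert!(is_separator('\u{00A0}')); // non-breaking space separates
    assert!(is_separator(' '));
    assert!(!is_separator('б'));       // Cyrillic letter does not
    assert!(!is_separator('a'));
    println!("ok");
}
```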
Use HMM feature on jieba
Changes
- Merge pull request #23 from meilisearch/use-hmm-on-jieba: use the HMM feature on jieba