Releases: tatsu-lab/alpaca_eval

Release v0.6.2

19 Apr 06:28

Full Changelog: v0.6.1...v0.6.2

Release v0.6.1

13 Apr 05:40

Full Changelog: v0.6...v0.6.1

Release v0.6

20 Mar 02:50

Full Changelog: v0.5.4...v0.6

Release v0.5.4

24 Feb 08:56

What's Changed

  • Add Qwen1.5-72B-Chat to AlpacaEval by @Lukeming-tsinghua in #226
  • Add claude-instant-1.2, deepseek-llm-67b-chat, wizardlm-70b, Qwen-14B-Chat (config + outputs without annotations) by @gblazex in #228
  • [DATA] Adding annotations for the arena models by @YannDubs in #229
  • Update README.md - Add missing "Y" to "ou" by @yoderj in #230
  • [DEV] Analyzing length-controlled metrics. by @YannDubs in #231
  • [DOC] add annotation interpretation by @YannDubs in #232
  • [DATA] add results from the Arena openai models by @YannDubs in #234
  • update ELO for llama-2-13b-chat-hf by @gblazex in #235
  • [NOTEBOOK] add length-corrected GLM by @YannDubs in #237
  • [ENH] add inverse mapper to make sure in and out types are the same by @YannDubs in #240
  • [ENH] update to allow AF to use AE by @YannDubs in #241

Full Changelog: v0.5.3...v0.5.4

Release v0.5.3

01 Feb 08:54

Full Changelog: v0.5.2...v0.5.3

Release v0.5.2

10 Jan 23:57

Full Changelog: v0.5.1...v0.5.2

Release v0.5.1

10 Jan 06:16

Full Changelog: v0.5.0...v0.5.1

Release v0.5.0

10 Jan 02:32

Full Changelog: v0.3.6...v0.5.0

Release v0.3.6

24 Nov 22:50

Full Changelog: v0.3.5...v0.3.6

Release v0.3.5

16 Nov 23:19

Full Changelog: v0.3.3...v0.3.5