Releases: promptfoo/promptfoo

0.59.1

18 May 17:22

0.59.0

18 May 07:03

What's Changed

  • fix: python prompts break when using whole file by @typpo in #784
  • feat(webui): add --filter-description option to promptfoo view by @typpo in #780
  • Langfuse needs to compile variables by @albertpurnama in #779
  • chore(webui): display prompt and completion tokens by @typpo in #794
  • chore: include full error response in openai errors by @typpo in #791
  • chore: add logprobs to assertion context by @typpo in #790
  • feat: support var interpolation in function calls by @typpo in #792
  • chore: add timestamp to EvaluateSummary by @typpo in #785
  • fix: render markdown in variables too by @typpo in #796
  • feat(bedrock): add support for embeddings models by @typpo in #797
  • fix(vertex): remove leftover dependency on apiKey by @typpo in #798

Full Changelog: 0.58.1...0.59.0

0.58.1

14 May 04:53

0.58.0

09 May 19:11

Breaking

rouge-type assertions no longer support multiple reference strings. This is due to an update to the underlying rouge package. To check multiple strings, break them into separate assertions.
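
As a rough illustration of the migration described above, the following Python sketch splits one assertion that lists several reference strings into one assertion per string. The `type`/`value` dictionary shape and the `rouge-n` type name are assumptions meant to mirror a promptfoo config entry, and `split_rouge_assertion` is a hypothetical helper, not part of promptfoo.

```python
def split_rouge_assertion(assertion):
    """Split an assertion with a list of references into one assertion per reference.

    Hypothetical helper: the dict shape ({"type": ..., "value": ...}) is an
    assumption about how the assertion appears in a config file.
    """
    value = assertion["value"]
    if isinstance(value, str):
        # Already a single reference string; return an unchanged copy.
        return [dict(assertion)]
    # One assertion per reference string, keeping every other field as-is.
    return [{**assertion, "value": ref} for ref in value]


# Old-style assertion with multiple references (no longer supported).
old_style = {"type": "rouge-n", "value": ["Hello world", "Hi there"]}

for item in split_rouge_assertion(old_style):
    print(item)
```

Each resulting entry can then be listed as its own assertion in the test case.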

What's Changed

  • feat: assert-set by @mikkoh in #765
  • feat: add comma-delimited string support for array-type assertion values by @typpo in #755
  • fix: Resolve JS assertion paths relative to configuration file by @Arkham in #756
  • fix: not-equals assertion by @EKranjec in #763
  • fix: upgrade rouge package and limit to strings by @typpo in #764

Full Changelog: 0.57.1...0.58.0

0.57.1

02 May 05:42

Full Changelog: 0.57.0...0.57.1

0.57.0

01 May 18:44

Breaking

The eval --first-n option has been renamed to eval --filter-first-n to match other new filtering options.
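
For scripts that build the `promptfoo eval` command line programmatically, the rename above could be handled with a small argument rewrite. This is only a sketch: `migrate_eval_args` is a hypothetical helper, and handling of the `--first-n=5` form assumes the CLI accepts `--option=value` syntax.

```python
def migrate_eval_args(args):
    """Return a copy of args with --first-n renamed to --filter-first-n.

    Hypothetical helper for updating wrapper scripts; not part of promptfoo.
    """
    old, new = "--first-n", "--filter-first-n"
    migrated = []
    for arg in args:
        if arg == old:
            # Separate-token form: --first-n 5
            migrated.append(new)
        elif arg.startswith(old + "="):
            # Assumed combined form: --first-n=5
            migrated.append(new + arg[len(old):])
        else:
            migrated.append(arg)
    return migrated


command = migrate_eval_args(["promptfoo", "eval", "--first-n", "5"])
print(" ".join(command))
```

Arguments that already use the new name pass through unchanged.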

What's Changed

  • feat: ability to override provider per test case by @typpo in #725
  • feat: eval tests matching pattern by @mikkoh in #735
  • feat: add -n limit arg for promptfoo list by @typpo in #749
  • feat: promptfoo import and promptfoo export commands by @typpo in #750
  • feat: add support for --var name=value cli option by @typpo in #745
  • feat: promptfoo eval --filter-failing outputFile.json by @mikkoh in #742
  • fix: eval --first-n arg by @typpo in #734
  • chore: Update openai package to 3.48.5 by @matteodepalo in #739
  • chore: include logger and cache utils in javascript provider context by @typpo in #748
  • chore: add PROMPTFOO_FAILED_TEST_EXIT_CODE envar by @typpo in #751
  • docs: Document python: prefix when loading assertions in CSV by @efung in #731
  • docs: update README.md by @eltociear in #733
  • docs: Fixes to Python docs by @jamesbraza in #728
  • docs: Update to include --filter-* cli args by @mikkoh in #747

Full Changelog: 0.56.0...0.57.0

0.56.0

28 Apr 17:56

What's Changed

  • feat: Integration with Langfuse by @tam0201 in #707
  • feat(webui): improved comment dialog by @typpo in #713
  • feat: Support IBM Research BAM provider by @abratnap in #711
  • fix: Make errors uncached in Python completion. by @grahl in #706
  • fix(vertex/gemini): support nested generationConfig by @typpo in #714
  • fix: include python tracebacks in python errors by @typpo in #724
  • fix: getCache should return a memory store when disk caching is disabled by @typpo in #715
  • chore(webui): improve eval view performance by @typpo in #719
  • chore(webui): always show provider in header by @typpo in #721
  • chore: add support for OPENAI_BASE_URL envar by @typpo in #717

Full Changelog: 0.55.0...0.56.0

0.55.0

24 Apr 03:38

What's Changed

  • [Docs] Add llama3 example to ollama docs by @chanonroy in #695
  • bugfix in answer-relevance by @alexandres in #697
  • feat: add support for provider transform property by @typpo in #696
  • feat: add support for provider-specific delays by @typpo in #699
  • feat: portkey.ai integration by @typpo in #698
  • feat: eval -n arg for running the first n test cases by @typpo in #700
  • feat: ability to write outputs to google sheet by @typpo in #701
  • feat: first-class support for openrouter by @typpo in #702
  • Fix concurrent cache request behaviour by @chrisprice in #703

Full Changelog: 0.54.1...0.55.0

0.54.1

20 Apr 04:50

What's Changed

  • Add support for Mixtral 8x22B by @streichsbaer in #687
  • fix: google sheets async loading by @typpo in #688
  • fix: trim spaces in csv assertions that can have file:// prefixes by @typpo in #689
  • fix: apply thresholds to custom python asserts by @typpo in #690
  • fix: include detail from external python assertion by @typpo in #691
  • chore(webui): allow configuration of results per page by @typpo in #694
  • fix: ability to override rubric prompt for all model-graded metrics by @typpo in #692

Full Changelog: 0.54.0...0.54.1

0.54.0

18 Apr 04:56

What's Changed

  • feat: support for authenticated google sheets access by @typpo in #686
  • fix: bugs in Answer-relevance calculation by @anthonyivn2 in #683
  • fix: Add tool calls to response from azure openai by @CamdenClark in #685

Full Changelog: 0.53.0...0.54.0