Importing config vars from a .csv file #679

Open
alexclaydon opened this issue Apr 16, 2024 · 11 comments

@alexclaydon
Contributor

alexclaydon commented Apr 16, 2024

Hello, having had a look through the documentation (Import vars from separate files) and the examples, it appears that importing vars from external files is limited to YAML, TXT, Python, and JavaScript files.

Are there any plans to allow importing from a CSV file, where each column header is a variable name and each row is a single complete configuration? We feel this would be particularly helpful when composing "scenarios" where you have complex setup config to apply to each test.
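
For illustration (hypothetical column names), the idea is a CSV where each row fully parameterises one scenario:

topic,description
Liabilities for Business Losses,file://case01.txt
Liabilities for Data Losses,file://case02.txt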

@typpo
Collaborator

typpo commented Apr 17, 2024

Hey @alexclaydon, is it possible this import from CSV functionality does what you want?

It works at the tests level rather than the vars level, but you can include it in your promptfoo config like this:

tests: my_tests.csv

And the CSV includes a bunch of vars mapped to columns.
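
For example (a hypothetical my_tests.csv), each column becomes a var, and the special __expected column can carry an inline assertion:

topic,description,__expected
Business Losses,Indemnity case,icontains: liability
Data Losses,GDPR case,icontains: data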

@alexclaydon
Contributor Author

alexclaydon commented Apr 19, 2024

Thanks Ian!

First of all, we're loving promptfoo and are going all in on it - thanks for all of your excellent work!

Having thought about this issue for a few days and discussed it with a colleague, we realised that I should have framed it differently. Please allow me to provide some additional context on the use case to flesh out the issue as we see it. It's always possible that we've misunderstood the paradigm.

We're making extensive use of "scenarios" in our setup, like so:

description: ...
providers: ...
prompts: file://prompt_templates.txt  # prompt template containing nunjucks tags for `topic` and `description`

scenarios:
  - config:
      - vars:
          topic: "Liabilities for Business Losses"
          description: file://case01.txt
      - vars:
          topic: "Liabilities for Data Losses"
          description: file://case02.txt
      - vars:
          topic: "Liabilities for Other Losses"
          description: file://case03.txt
    tests: './data.csv'  # contains 4 columns: `text` (str), `definitions` (str), `data_loss` (bool), `business_loss` (bool)

The crux of the issue for us is that, in scenarios, it doesn't seem to be possible to specify assertions under the config section that are applied (ceteris paribus) to each row; they must instead be inlined row by row in the .csv data file itself.

Accordingly, if we want to check, for example, whether some output matches the value of one of the bool columns in our data spreadsheet, we need to go down each row and change the name of that column to __expected and its value from "TRUE" to (for example) "equals: TRUE". Given that our data spreadsheets are often the main interface between technical and non-technical teams, that's quite tricky for us.
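
Concretely (with hypothetical rows), a data file like

text,definitions,data_loss,business_loss
"The supplier shall not be liable...","See Schedule 1",FALSE,TRUE

has to be hand-edited, row by row, into

text,definitions,data_loss,__expected
"The supplier shall not be liable...","See Schedule 1",FALSE,"equals: TRUE"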

Ideally, we'd prefer to write a one-line "wrapper" assertion, leveraging vars set out in the .csv file (and elsewhere), and just drop it into the config section instead, where it would then be applied on a row by row basis.

I hope that all makes sense and would really appreciate any guidance you might be able to provide!

@alexclaydon
Contributor Author

alexclaydon commented Apr 30, 2024

Just had an additional thought on your response above which might clear things up for us a bit.

You mentioned that "the CSV includes a bunch of vars mapped to columns." Does that imply that I can access column names / vars from within the config file in an assert immediately following tests: my_tests.csv, like so (or something similar)?

description: 'Extraction'

prompts: ['{{ clause }}']

providers:
  - id: 'python:../shims/extract.py'
    label: 'Extraction'
    config:
      evaluateOptions.maxConcurrency: 1
      evaluateOptions.showProgressBar: true

tests:
  - '../testdata/data.csv'
  - assert:
    - type: icontains-any
      value: '{{ entities }}'  # where the `.csv` file contains at least `clause` and `entities` columns

I've been trying to get that working today but wasn't able to - I also had a look through the examples directory, but couldn't find anything on point.

If this kind of thing is possible, that would enable us to conveniently "wrap" asserts around row data as I described above - so a complete solution!

@typpo
Collaborator

typpo commented May 1, 2024

I think the right construction here is:

defaultTest:
  assert:
    - type: icontains-any
      value: '{{ entities }}'

tests: '../testdata/data.csv'

The defaultTest object will apply to all test cases, and should be able to pick up columns from your CSV.
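
For instance (hypothetical file contents), data.csv might look like:

clause,entities
"The Supplier shall indemnify the Customer against all losses.",Supplier

so each row's clause fills the prompt and its entities value fills the {{ entities }} placeholder in defaultTest.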

@alexclaydon
Contributor Author

Thank you Ian!

The confirmation that this is supposed to work is very helpful. I had actually tried this approach, but without success. I think I now know why: it appears to relate to the way arrays are handled.

Since this particular assert is icontains-any, I had set up my .yaml file exactly as you have it above. I then tried setting up the entities column in my .csv file a few different ways: (i) including / excluding square brackets ([]), (ii) including / excluding single and double quote marks around each string in the array, and (iii) trying both commas and semi-colons as delimiters. All of those approaches failed with a message to the effect that this particular assert requires an array.

In addition, I tried enclosing the value of the value field in my .yaml file in square brackets, as follows:

defaultTest:
  assert:
    - type: icontains-any
      value: ['{{ entities }}']

tests: '../testdata/data.csv'

That seemed to resolve the above error about the assert requiring an array, but also resulted in the actual value of the entities column being returned as a single string (enclosed in an array), no matter what I tried.
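
Presumably (illustrating with a hypothetical cell value) the template renders to a single string before the assertion runs, so the whole cell lands in one array element:

# entities cell contains: apple, banana
value: ['{{ entities }}']   # renders to ['apple, banana'] - one element, not two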

Could you advise on the correct approach here? Unless I missed it, there seems to be a bit of a gap in the documentation. I'd be very happy to submit a documentation PR if we can get to the bottom of this, if that would be helpful?

Cheers

@typpo
Collaborator

typpo commented May 2, 2024

Ok, I see the problem! I went through the whole troubleshooting and debugging process (e.g. using __expected, wrapping in brackets), all of which you had already explained in great detail :)

I think the best way to fix this is to add comma-delimited string support for icontains-any and similar assertion types. That way you'll be able to do:

defaultTest:
  assert:
    - type: icontains-any
      value: '{{ entities }}'

tests: '../testdata/data.csv'

and you can include an entities column in your CSV, and it will work as intended.

#755 implements this (although it naively splits on any comma, and won't respect CSV-style quotes like "hello, world")
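
In other words (hypothetical cell values), after that change a cell is split naively on every comma before the assertion runs:

apple,banana,cherry   → ['apple', 'banana', 'cherry']
"hello, world",foo    → ['"hello', ' world"', 'foo']   # quoted commas get split too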

@alexclaydon
Contributor Author

Perfect! Thanks so much for this - really appreciate the attention. That's the final piece of the puzzle for pushing promptfoo to adoption across our whole team. Incredibly helpful!

@typpo
Collaborator

typpo commented May 7, 2024

Sorry for the delay here @alexclaydon - it's merged and will be in the next release!

@alexclaydon
Contributor Author

Not at all - we really appreciate you taking the time on this!

@jerem99

jerem99 commented May 15, 2024

@typpo thank you for this explanation - it helped me use CSV files to test on large datasets.
It would be nice if the UI could display the values of "entities" instead of just the literal {{ entities }}; see the image below:

[Screenshot (2024-05-15): the promptfoo web UI showing the unrendered {{ entities }} template rather than its values]

@ThibaultDef

ThibaultDef commented May 15, 2024

Hello,

I am also testing LLMs from CSV files. However, I need to use a custom Python script to calculate some metrics in the assertions.

In the config.yaml file, the code looks like:

defaultTest:
    assert:
        - type: python
        - value: file://../../src/assertions/custom_metric.py

tests:
    - dataset/*.csv

My project structure looks like this:

├── src
│   └── assertions
│       ├── __init__.py
│       └── custom_metric.py
└── tests
    └── project
        ├── config.yaml
        └── dataset

The folder src contains all Python dependencies.
The folder tests contains all tests applied by promptfoo.

After executing the following command at the project root:

promptfoo eval -c tests/project/config.yaml

It returns the following error:

TypeError: Cannot read properties of undefined (reading 'startsWith')

Is there a nice hack to make it work?
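
One likely culprit, judging purely from YAML list semantics (a guess, not a confirmed diagnosis): the dash before value starts a second list item, so the python assertion itself ends up with no value, and that undefined value would explain the startsWith error. A corrected sketch:

defaultTest:
  assert:
    - type: python
      value: file://../../src/assertions/custom_metric.py

tests:
  - dataset/*.csv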
