Random sampling of n data rows #740

Open
alexclaydon opened this issue Apr 30, 2024 · 1 comment

Comments

@alexclaydon

The -n / --first-n flag was recently introduced so that a run can cover only the first n test cases rather than the whole set. Would it be worth considering another flag that randomly samples n test cases from the complete set? This would be particularly convenient when testing against very large datasets stored in, e.g., .csv files. At the moment, the only way to do this is to copy-paste the cases you want to run into a separate file.
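
To make the request concrete, here is a rough sketch of the manual workaround as a script. It is not part of promptfoo and does not use any promptfoo API; the function name `sample_rows`, the file names, and the assumption that the first row of the .csv is a header are all hypothetical and only for illustration.

```python
import csv
import random


def sample_rows(src_path, dst_path, n, seed=None):
    """Copy the header plus a random sample of up to n data rows to dst_path.

    Hypothetical helper: assumes the first row of the CSV is a header row.
    Returns the sampled rows as a list of lists.
    """
    with open(src_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, None)  # None if the file is empty
        rows = list(reader)

    if header is None:
        return []

    # A seeded generator makes the sample reproducible between runs.
    rng = random.Random(seed)
    # Sample without replacement; cap n at the number of available rows.
    sampled = rng.sample(rows, min(n, len(rows)))

    with open(dst_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(sampled)

    return sampled


if __name__ == "__main__":
    # Hypothetical file names; adjust to the actual dataset.
    sample_rows("tests.csv", "tests_sample.csv", n=50, seed=42)
```

The sampled file would then stand in for the full dataset on that run. A built-in flag would remove the need for this extra step and the extra file.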

If this feature is a good fit for promptfoo, would it be possible to make it available not just from the CLI, but also in the .yaml test config? Relatedly, would it also make sense to enable use of --first-n from the .yaml config as well?

Thanks!

@typpo
Collaborator

typpo commented May 1, 2024

This is an interesting idea, and wouldn't be too difficult to implement 👀

A tangentially related heads-up: --first-n is being renamed to --filter-first-n in the next release to match other new filtering options.
