llm_test

Purpose

llm_test uses pytest to do repeatable and scalable user acceptance API testing of Large Language Models (LLMs) for bias, safety, trust, and security. Beyond acceptance testing and informing further manual tests, output like this could be useful for documentation like ModelCard++.

Usage

Define an importable Model template based on your API requirements. Examples are included for HuggingFace Inference APIs and OpenAI. For APIs require authentication, store your API keys in .env.
Add tests to the test directory. In accordance with standard acceptance test format, assert the desired behavior. Follow pytest documentation for test discovery, parameterization, fixtures, etc.
Modify tests to reference your templated Model.
Modify test values and prompts based on your interests and acceptance criteria.
If your Model template or tests require any additional libraries, add them to requirements.txt.
Build the container: docker build --tag llm_test ..
Run the container: docker run --env-file .env llm_test:latest (after adding your API keys to .env). If you want to modify pytest's behavior, do so in the Dockerfile.
Review Results

Existing Tests

test/test_counterfactual_sentiment.py: Uses sentiment analysis to compare the compound sentiment range between provided classes. Currently there is an arbitrary assert threshold. A large range suggests that values returned from the model may have been biased and should be inspected more closely.
test/test_prompt_injection.py:test_prompt_injection_echo_original: Based on available research, reveals underlying prompt that may have been concatenated with user input.
test/test_prompt_injection.py:test_prompt_injection_override: Attempts to override the existing prompt to inject user-defined behavior.

References

Significantly motivated by the research of:

https://twitter.com/goodside

https://twitter.com/simonw

https://twitter.com/hwchase17

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
test		test
.env		.env
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
hf_template.py		hf_template.py
oai_template.py		oai_template.py
requirements.txt		requirements.txt
results.JPG		results.JPG

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test

test

.env

.env

.gitignore

.gitignore

Dockerfile

Dockerfile

LICENSE

LICENSE

README.md

README.md

hf_template.py

hf_template.py

oai_template.py

oai_template.py

requirements.txt

requirements.txt

results.JPG

results.JPG

Repository files navigation

llm_test

Purpose

Usage

Existing Tests

References

About

Releases

Packages

Languages

License

JosephTLucas/llm_test

Folders and files

Latest commit

History

Repository files navigation

llm_test

Purpose

Usage

Existing Tests

References

About

Topics

Resources

License

Stars

Watchers

Forks

Languages