LLM-Security

A Security Testing Suite for Large Language Models

Introduction

LLM-Security is my personal security benchmark for testing large language models. It contains a collection of exploits designed to make models behave maliciously.

Warning

This collection is intended for educational purposes only. Do NOT use it for illegal activities.

Getting Started

To get started with LLM-Security, follow these steps:

  1. Clone the repository
git clone https://github.com/nnxmms/LLM-Security.git
  2. Change directory to the downloaded repository
cd LLM-Security
  3. Create a virtual environment
virtualenv -p python3.11 env
  4. Activate the environment
source ./env/bin/activate
  5. Install requirements
pip install -r requirements.txt
  6. Register at OpenAI to obtain an OPENAI_API_KEY

  7. Create a .env file and update the values

cp .env.example .env

Usage

Now you can run the benchmark with the following command:

python3 benchmark.py

The results will be stored in a benchmark.json file.
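
As a quick way to inspect the output, here is a minimal sketch that prints a summary of benchmark.json. It assumes the file holds a list of result entries with fields such as exploit and success; the actual schema written by benchmark.py may differ.

import json

# Load the results written by benchmark.py
with open("benchmark.json") as f:
    results = json.load(f)

# "exploit" and "success" are assumed field names, not confirmed by the repository
for entry in results:
    print(entry.get("exploit"), "->", entry.get("success"))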

OpenAI Example .env

# General
VENDOR=openai
MODELNAME=gpt-3.5-turbo

# OpenAI
OPENAI_API_KEY=sk-...

Ollama Example .env

# General
VENDOR=ollama
MODELNAME=llama3:instruct

# OpenAI
OPENAI_API_KEY=
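
For reference, here is a minimal sketch of how a script could load these values and dispatch to the chosen backend. It assumes the python-dotenv, openai, and ollama packages and is not the repository's actual implementation.

import os
from dotenv import load_dotenv

load_dotenv()  # reads VENDOR, MODELNAME and OPENAI_API_KEY from .env
vendor = os.getenv("VENDOR", "openai")
model = os.getenv("MODELNAME", "gpt-3.5-turbo")

def ask(prompt: str) -> str:
    # Hypothetical helper; benchmark.py may be structured differently
    if vendor == "openai":
        from openai import OpenAI
        client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content
    if vendor == "ollama":
        import ollama
        resp = ollama.chat(model=model, messages=[{"role": "user", "content": prompt}])
        return resp["message"]["content"]
    raise ValueError(f"Unsupported VENDOR: {vendor}")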

Exploits

This table provides an overview of all exploits that are used within this benchmark.

Paper | Link
Scalable and Transferable Black-Box Jailbreaks for Language Models via Persona Modulation | Hacking and Security - Persona Modulation
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs | Hacking and Security - ArtPrompt
