
An implementation of a GPT-4 Turbo powered chatbot with safety guards to avoid a "$1 Chevrolet" incident. It is currently overzealous and rejects some sensible inputs. It uses two GPT instances, plus tripwires in the prompts, to guard against prompt injection and to avoid responding to dangerous requests.


CADawg/gpt-support-chatbot-with-safety-guards


Anti Jailbreak Site Chatbot with GPT-4 Turbo

This project is an attempt at preventing a "$1 Chevy sale incident" by using two instances of GPT-4 Turbo to answer each question. It is at a very early stage and is overzealous: it blocks some innocent questions and inputs.

Demo

antijailbreakchatbot-nosound.mp4

How to use

go run .

will launch a web server on port 7129, or on the port given by the PORT environment variable if it is set. You need to provide your OPENAI_API_KEY as an environment variable or put it into a .env file. The chat interface is basic and shows no indication that a reply is in progress; just wait a few seconds for it to appear.
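The port and API key rules above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the names defaultPort, resolvePort, and checkAPIKey are hypothetical, and loading values from a .env file is not covered here.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// defaultPort is the fallback port named in this README.
const defaultPort = "7129"

// resolvePort (hypothetical helper) returns the PORT value when it is
// non-empty and falls back to the default otherwise.
func resolvePort(envPort string) string {
	if envPort != "" {
		return envPort
	}
	return defaultPort
}

// checkAPIKey (hypothetical helper) reports an error when no key is set.
func checkAPIKey(key string) error {
	if key == "" {
		return errors.New("OPENAI_API_KEY is not set")
	}
	return nil
}

func main() {
	port := resolvePort(os.Getenv("PORT"))
	if err := checkAPIKey(os.Getenv("OPENAI_API_KEY")); err != nil {
		fmt.Println("warning:", err)
	}
	fmt.Println("would listen on :" + port)
}
```

Keeping the helpers free of direct environment access makes them easy to check without changing the process environment.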

What can I ask it?

It's currently trained on ConfigDN, so questions should relate to it to avoid rejection. TL;DR: ConfigDN is an open source configuration management and feature flag system that currently supports Go and JavaScript. Ask questions about that. For more ideas, see the prompts in the /prompts folder.

How it works

It first sends the user's input to a GPT-4 Turbo instance, which checks it for any issues and rewrites the prompt safely for the second instance. The second instance then answers the rewritten question, but only if the input was considered "safe". There are also tripwires in the prompts to detect an attempted prompt leak.
