GitHub - chiral-carbon/inverse-prompt: Inverse prompting LLMs for interpretability

Description

[This is a work in progress]

This project uses Metropolis-Hastings (MH) sampling to recover an LLM's prompt given its response. BERT-based models serve as the MH proposal distribution; proposal tokens are currently sampled from RoBERTa.
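The accept/reject logic of a single-token MH update can be sketched as below. This is a minimal illustration and not the code in mh.py: `toy_log_score` stands in for the LLM sequence log-probability, `uniform_propose` stands in for the masked-LM proposal, and `VOCAB`, `TARGET`, `mh_step` and `run_chain` are hypothetical names introduced only for this example.

```python
import math
import random

VOCAB = ["a", "b", "c", "d", "e"]  # hypothetical toy vocabulary
TARGET = ["c", "a", "b"]           # hypothetical "true prompt", known only to the toy scorer


def toy_log_score(tokens):
    # Stand-in for the LM sequence log-probability (assumption):
    # +50 log-units for every position that matches TARGET.
    return 50.0 * sum(t == g for t, g in zip(tokens, TARGET))


def uniform_propose(tokens, pos, rng):
    # Stand-in for a masked-LM proposal (assumption): uniform over VOCAB.
    # Returns (new_token, log q(new | context), log q(old | context)).
    log_q = -math.log(len(VOCAB))
    return rng.choice(VOCAB), log_q, log_q


def mh_step(tokens, log_score, propose, rng):
    # Pick one position, propose a replacement token, then accept or reject.
    pos = rng.randrange(len(tokens))
    new_tok, log_q_fwd, log_q_rev = propose(tokens, pos, rng)
    candidate = tokens[:pos] + [new_tok] + tokens[pos + 1:]
    # MH log acceptance ratio: target ratio times reverse/forward proposal ratio.
    log_alpha = log_score(candidate) - log_score(tokens) + log_q_rev - log_q_fwd
    accepted = rng.random() < math.exp(min(0.0, log_alpha))
    return (candidate if accepted else tokens), accepted


def run_chain(init, n_steps, seed=0):
    # Run n_steps MH updates from the initial token list with a seeded RNG.
    rng = random.Random(seed)
    tokens = list(init)
    for _ in range(n_steps):
        tokens, accepted = mh_step(tokens, toy_log_score, uniform_propose, rng)
    return tokens
```

Calling `run_chain(["a", "a", "a"], 2000)` runs the chain from an arbitrary starting sequence. With a proposal that is not uniform, such as a masked language model, the forward and reverse proposal log-probabilities differ and the correction term `log_q_rev - log_q_fwd` no longer cancels.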

Sequence probabilities were originally computed with GPT-3.5; the code now uses Llama-2 (7B).
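For reference, a sequence log-probability under an autoregressive model is the sum of per-token conditional log-probabilities. The sketch below shows that computation on plain Python lists, assuming one vector of next-token logits per position has already been obtained from the model; `log_softmax` and `sequence_log_prob` are hypothetical helper names and are not taken from mh.py.

```python
import math


def log_softmax(logits):
    # Numerically stable log-softmax over one vector of logits.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def sequence_log_prob(step_logits, token_ids):
    # step_logits[t] holds the logits for position t given tokens before t;
    # token_ids[t] is the token actually observed at position t.
    if len(step_logits) != len(token_ids):
        raise ValueError("need exactly one logit vector per token")
    return sum(log_softmax(logits)[tok] for logits, tok in zip(step_logits, token_ids))
```

In an MH sampler this quantity would play the role of the target log-score that is compared between the current and the proposed sequence.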

Prerequisites

The code is submitted as a job on NYU's Greene HPC cluster, so first follow this Greene Singularity setup guide.

Then start your Singularity container with the command:

$ singularity exec --overlay /scratch/netID/PATH/TO/OVERLAY/IMG:rw /scratch/work/public/singularity/cuda11.8.86-cudnn8.7-devel-ubuntu22.04.2.sif /bin/bash

The code is written for Python 3.11, so use that Python version when creating the container.

Inside the Singularity conda environment (after following the setup guide), install PyTorch:

$ conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=11.8 -c pytorch -c nvidia

Install the remaining libraries with:

$ pip install -r requirements.txt

The Llama-2 7B model used in the code is hosted on NYU Greene HPC at /vast/work/public/ml-datasets/llama-2/Llama-2-7b-hf.

Running the algorithm

The file mh.py contains the test code, which currently runs a single example (batch size = 1).

After exiting the Singularity container, submit the code as a Slurm batch job:

$ sbatch run.sh

Before submitting, replace the email ID, overlay location and image location with your own, and adjust the GPU/CPU and memory resources as required.

Folders and files

llama-files
.gitignore
LICENSE
MH-BART.ipynb
MH-RoBERTa.ipynb
README.md
mh.py
requirements.txt
run.sh
