
DP-Decoding-in-LLM

Motivation

Large language models are pre-trained on vast amounts of public data collected from the Internet, which very likely contain private or sensitive information. Since large models also tend to memorize their training data, this poses a potential risk of data leakage.


Differential privacy (DP) is a paradigm that can help in this regard. For AI models, DP is usually applied during the training phase, but because re-training a large model is costly, it can instead be incorporated at inference time only.

Method

Recall that at inference time the LLM generates text token by token: at each step it produces a probability distribution over the vocabulary and selects the next token from it. The rule used to pick the token is known as the "decoding strategy". Various decoding strategies exist, and one basic approach is to simply sample a token according to this distribution.
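
For illustration, below is a minimal sketch of a single sampling-decoding step with GPT-2 via Hugging Face transformers (the prompt is a placeholder, and this is not necessarily the exact code used in this repo):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "The capital of France is"          # placeholder prompt
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits         # shape: (1, seq_len, vocab_size)

# q is the model's probability distribution over the vocabulary for the next token
q = torch.softmax(logits[0, -1], dim=-1)

# Basic sampling decoding: draw the next token directly from q
next_token_id = torch.multinomial(q, num_samples=1)
print(tokenizer.decode(next_token_id))
```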

The paper "Differentially Private Decoding in Large Language Models" authored by Majmudar, Jimit, et al. (2022), proposes a straightforward perturbation approach to this decoding strategy. This method applies linear interpolation between the original distribution q and the uniform distribution u:

p = λ·q + (1 − λ)·u,  where λ ∈ [0, 1] and u assigns probability 1/|V| to each token in the vocabulary V.

The parameter λ controls the trade-off between utility (the quality of the generated output) and privacy (the protection of sensitive information): λ = 1 leaves the original distribution unchanged (full utility, no added privacy), while λ = 0 replaces it with the uniform distribution (maximum privacy, no utility).

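A minimal sketch of this perturbation step (the helper name dp_sample and the stand-in distribution below are illustrative only):

```python
import torch

def dp_sample(q: torch.Tensor, lam: float) -> int:
    """Sample a token id from p = lam * q + (1 - lam) * u,
    where u is the uniform distribution over the vocabulary."""
    u = torch.full_like(q, 1.0 / q.shape[-1])   # uniform distribution
    p = lam * q + (1.0 - lam) * u               # linear interpolation
    return torch.multinomial(p, num_samples=1).item()

# Stand-in for the model's next-token distribution (see the previous snippet)
q = torch.softmax(torch.randn(50257), dim=-1)

dp_sample(q, lam=1.0)   # ordinary sampling (full utility)
dp_sample(q, lam=0.0)   # uniform sampling (maximum privacy)
```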

My Experiments

Experiment 1: Sentence Completion task (using the GPT-2 model)

T is the number of predicted tokens.

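A rough sketch of how such a completion run could look, generating T tokens with GPT-2 and applying the perturbed sampling at each step (the prompt and the values of T and λ below are placeholders, not the exact experimental setup):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def complete(prompt: str, T: int, lam: float) -> str:
    """Generate T tokens after the prompt, sampling each one
    from the lambda-interpolated distribution."""
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    for _ in range(T):
        with torch.no_grad():
            logits = model(input_ids).logits
        q = torch.softmax(logits[0, -1], dim=-1)
        u = torch.full_like(q, 1.0 / q.shape[-1])
        p = lam * q + (1.0 - lam) * u
        next_id = torch.multinomial(p, num_samples=1)
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)
    return tokenizer.decode(input_ids[0])

print(complete("The best way to learn a language is", T=10, lam=0.9))
```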

Experiment 2: Visual QA task (using ViLT model)

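In ViLT's VQA setup the model scores a fixed set of candidate answers rather than generating free text, so the same interpolation can be applied to the answer distribution. Below is a minimal sketch, assuming the dandelin/vilt-b32-finetuned-vqa checkpoint (the image path and question are placeholders; ViLT's VQA head is normally read with sigmoid/argmax, so treating its softmax output as a sampling distribution is an illustrative choice):

```python
import torch
from PIL import Image
from transformers import ViltProcessor, ViltForQuestionAnswering

processor = ViltProcessor.from_pretrained("dandelin/vilt-b32-finetuned-vqa")
model = ViltForQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa")
model.eval()

image = Image.open("example.jpg")            # placeholder image
question = "What color is the car?"          # placeholder question

encoding = processor(image, question, return_tensors="pt")
with torch.no_grad():
    logits = model(**encoding).logits        # scores over the candidate answers

# Treat the answer scores as a distribution and apply the same interpolation
q = torch.softmax(logits[0], dim=-1)
lam = 0.9
u = torch.full_like(q, 1.0 / q.shape[-1])
p = lam * q + (1.0 - lam) * u

answer_id = torch.multinomial(p, num_samples=1).item()
print(model.config.id2label[answer_id])
```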
