Analyzing the Robustness and the Reliability of Large Language Models

Hanan Gani, Rohit K Bharadwaj, Muhammad Huzaifa

paper

Official code for our NLP701 Project "Analyzing the Robustness and the Reliability of Large Language Models"

Table of Contents

  1. Abstract
  2. Robustness
  3. Reliability
    1. Code
    2. Data
  4. Contact

Abstract

Large Language Models (LLMs) are rapidly gaining traction in a variety of applications and perform impressively on numerous tasks. Despite their capabilities, there are rising concerns about the safety and reliability of these systems, particularly when they are exploited by malicious users. This study assesses LLMs along two critical dimensions: Robustness and Reliability. For Robustness, we evaluate LLMs against in-context attacks and adversarial suffix attacks, and we extend the analysis to Large Multi-modal Models (LMMs) to examine the effect of visual perturbations on language output. For Reliability, we examine the performance of well-known LLMs by generating passages about individuals from the WikiBio dataset and assessing the incidence of hallucinated responses. Our evaluation employs a black-box protocol conducted in a zero-resource setting. Despite the safety measures embedded in these models, our experiments demonstrate that they remain vulnerable to a range of attacks.

Robustness

Reliability

Figure: Distribution of passage-level hallucination scores for GPT-3.5, GPT-4, Vicuna, and MistralOrca, computed using BERTScore.

  • We observe that GPT-3.5 and GPT-4 perform much better than existing open-source LLMs in terms of hallucination, as they obtain lower hallucination scores.
  • The performance difference between GPT-3.5 and GPT-4 is almost negligible when BERTScore is used to evaluate hallucinations; a sketch of this scoring appears below.
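In the zero-resource, black-box setting mentioned in the abstract, a generated passage is checked only against additional stochastically sampled passages, with low similarity read as evidence of hallucination. The following is a minimal sketch of such BERTScore-based scoring; it assumes the bert_score and nltk packages, and the function name passage_hallucination_score is illustrative, so the repository's evaluate_bertscore.py may organise the computation differently.

    from bert_score import score as bertscore
    from nltk.tokenize import sent_tokenize  # requires nltk's "punkt" tokenizer data

    def passage_hallucination_score(main_passage: str, sampled_passages: list[str]) -> float:
        """Sentence-averaged (1 - max BERTScore F1) across samples; higher means less support."""
        sentences = sent_tokenize(main_passage)
        per_sentence = []
        for sent in sentences:
            sims = []
            for sample in sampled_passages:
                sample_sents = sent_tokenize(sample)
                # Score the sentence against every sentence of the sample and keep the best match.
                _, _, f1 = bertscore([sent] * len(sample_sents), sample_sents, lang="en")
                sims.append(f1.max().item())
            # 1 - mean similarity across samples gives the sentence-level hallucination score.
            per_sentence.append(1.0 - sum(sims) / len(sims))
        # The passage-level score is the mean over sentences.
        return sum(per_sentence) / len(per_sentence)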

Code

To replicate the environment for the project and run the code, follow these steps:

  1. Clone the Repository:
    git clone git@github.com:rohit901/LLM-Robustness-Reliability.git
    cd LLM-Robustness-Reliability
  2. Create and Activate a Conda Environment:
    conda create --name nlp_project python=3.11
    conda activate nlp_project
  3. Install Pip Packages:
    pip install -r requirements.txt
  4. To Generate Data from GPT-3.5/GPT-4 (see the first sketch after this list):
    python reliability/generate_gpt_data.py
    
    Make sure your OpenAI API key is saved in a .env file located at reliability/.env. The file should contain:
    OPENAI_API_KEY=sk-<your_api_key>
    
  5. To Evaluate Hallucination using SelfCheckGPT-Prompt (a sketch of the underlying idea also follows this list):
    python reliability/evaluate_selfcheckgpt_prompt.py
    
  6. To Evaluate Hallucination using BERTScore:
    python reliability/evaluate_bertscore.py
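
For step 4, the following is a minimal sketch of the data-generation setup, assuming the python-dotenv package and the openai>=1.0 client interface; the prompt, model name, and helper generate_passage are illustrative, and generate_gpt_data.py may structure the generation differently.

    import os

    from dotenv import load_dotenv
    from openai import OpenAI

    load_dotenv("reliability/.env")  # makes OPENAI_API_KEY available via the environment
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

    def generate_passage(concept: str, model: str = "gpt-3.5-turbo", temperature: float = 1.0) -> str:
        """Generate one passage about an individual from the WikiBio dataset."""
        response = client.chat.completions.create(
            model=model,
            temperature=temperature,  # non-zero temperature so repeated calls give diverse samples
            messages=[{"role": "user", "content": f"This is a Wikipedia passage about {concept}:"}],
        )
        return response.choices[0].message.content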
    
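For step 5, here is a from-scratch sketch of the SelfCheckGPT-Prompt idea: an LLM judge is asked whether each sentence of the main passage is supported by each sampled passage, and the fraction of unsupported verdicts becomes the hallucination score. The judge model, prompt template, and function name are assumptions; evaluate_selfcheckgpt_prompt.py may instead rely on the selfcheckgpt package.

    import os

    from dotenv import load_dotenv
    from nltk.tokenize import sent_tokenize
    from openai import OpenAI

    load_dotenv("reliability/.env")
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

    CHECK_TEMPLATE = (
        "Context: {context}\n\n"
        "Sentence: {sentence}\n\n"
        "Is the sentence supported by the context above? Answer Yes or No."
    )

    def prompt_hallucination_score(main_passage: str, sampled_passages: list[str],
                                   judge_model: str = "gpt-3.5-turbo") -> float:
        """Fraction of (sentence, sample) checks judged unsupported, averaged over sentences."""
        scores = []
        for sent in sent_tokenize(main_passage):
            votes = []
            for sample in sampled_passages:
                answer = client.chat.completions.create(
                    model=judge_model,
                    temperature=0.0,  # deterministic yes/no judgement
                    messages=[{"role": "user",
                               "content": CHECK_TEMPLATE.format(context=sample, sentence=sent)}],
                ).choices[0].message.content.strip().lower()
                votes.append(0.0 if answer.startswith("yes") else 1.0)  # 1.0 = not supported
            scores.append(sum(votes) / len(votes))
        # The passage-level score is the mean over sentence-level scores.
        return sum(scores) / len(scores)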

Data

The data for the reliability experiments can be downloaded from the following link:

Download Reliability Experiments Data

Contact
