Analyzing the Robustness and the Reliability of Large Language Models

Hanan Gani, Rohit K Bharadwaj, Muhammad Huzaifa

paper

Official code for our NLP701 Project "Analyzing the Robustness and the Reliability of Large Language Models"

Table of Contents

  1. Abstract
  2. Robustness
  3. Reliability
    1. Code
    2. Data
  4. Contact

Abstract

Large Language Models (LLMs) are rapidly gaining traction in a variety of applications and perform impressively on numerous tasks. Despite their capabilities, there are rising concerns about the safety and reliability of these systems, particularly when they are exploited by malicious users. This study assesses LLMs along two critical dimensions: Robustness and Reliability. For Robustness, we evaluate LLMs against in-context attacks and adversarial suffix attacks, and we extend the analysis to Large Multi-modal Models (LMMs) to examine the effect of visual perturbations on language output. For Reliability, we examine the performance of well-known LLMs by generating passages about individuals from the WikiBio dataset and assessing the incidence of hallucinated responses. Our evaluation employs a black-box protocol conducted in a zero-resource setting. Despite the safety measures embedded in these models, our experiments demonstrate that they remain vulnerable to a range of attacks.

Robustness

Reliability

Figure: Distribution of passage-level hallucination scores for GPT-3.5, GPT-4, Vicuna, and MistralOrca, computed using BERTScore.

  • We observe that GPT-3.5 and GPT-4 perform much better than existing open-source LLMs in terms of hallucination, as they obtain lower hallucination scores.
  • The performance difference between GPT-3.5 and GPT-4 is almost negligible when BERTScore is used to evaluate hallucinations; a sketch of this scoring appears below.
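In the zero-resource, black-box setting mentioned in the abstract, a generated passage is checked only against additional stochastically sampled passages, with low similarity read as evidence of hallucination. The following is a minimal sketch of such BERTScore-based scoring; it assumes the bert_score and nltk packages, and the function name passage_hallucination_score is illustrative, so the repository's evaluate_bertscore.py may organise the computation differently.

    from bert_score import score as bertscore
    from nltk.tokenize import sent_tokenize  # requires nltk's "punkt" tokenizer data

    def passage_hallucination_score(main_passage: str, sampled_passages: list[str]) -> float:
        """Sentence-averaged (1 - max BERTScore F1) across samples; higher means less support."""
        sentences = sent_tokenize(main_passage)
        per_sentence = []
        for sent in sentences:
            sims = []
            for sample in sampled_passages:
                sample_sents = sent_tokenize(sample)
                # Score the sentence against every sentence of the sample and keep the best match.
                _, _, f1 = bertscore([sent] * len(sample_sents), sample_sents, lang="en")
                sims.append(f1.max().item())
            # 1 - mean similarity across samples gives the sentence-level hallucination score.
            per_sentence.append(1.0 - sum(sims) / len(sims))
        # The passage-level score is the mean over sentences.
        return sum(per_sentence) / len(per_sentence)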

Code

To replicate the environment for the project and run the code, follow these steps:

  1. Clone the Repository:
    git clone git@github.com:rohit901/LLM-Robustness-Reliability.git
    cd LLM-Robustness-Reliability
  2. Create and Activate a Conda Environment:
    conda create --name nlp_project python=3.11
    conda activate nlp_project
  3. Install Pip Packages:
    pip install -r requirements.txt
  4. To Generate Data from GPT-3.5/GPT-4 (see the first sketch after this list):
    python reliability/generate_gpt_data.py
    
    Make sure your OpenAI API key is saved in a .env file located at reliability/.env. The file should contain:
    OPENAI_API_KEY=sk-<your_api_key>
    
  5. To Evaluate Hallucination using SelfCheckGPT-Prompt (a sketch of the underlying idea also follows this list):
    python reliability/evaluate_selfcheckgpt_prompt.py
    
  6. To Evaluate Hallucination using BERTScore:
    python reliability/evaluate_bertscore.py
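
For step 4, the following is a minimal sketch of the data-generation setup, assuming the python-dotenv package and the openai>=1.0 client interface; the prompt, model name, and helper generate_passage are illustrative, and generate_gpt_data.py may structure the generation differently.

    import os

    from dotenv import load_dotenv
    from openai import OpenAI

    load_dotenv("reliability/.env")  # makes OPENAI_API_KEY available via the environment
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

    def generate_passage(concept: str, model: str = "gpt-3.5-turbo", temperature: float = 1.0) -> str:
        """Generate one passage about an individual from the WikiBio dataset."""
        response = client.chat.completions.create(
            model=model,
            temperature=temperature,  # non-zero temperature so repeated calls give diverse samples
            messages=[{"role": "user", "content": f"This is a Wikipedia passage about {concept}:"}],
        )
        return response.choices[0].message.content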
    
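For step 5, here is a from-scratch sketch of the SelfCheckGPT-Prompt idea: an LLM judge is asked whether each sentence of the main passage is supported by each sampled passage, and the fraction of unsupported verdicts becomes the hallucination score. The judge model, prompt template, and function name are assumptions; evaluate_selfcheckgpt_prompt.py may instead rely on the selfcheckgpt package.

    import os

    from dotenv import load_dotenv
    from nltk.tokenize import sent_tokenize
    from openai import OpenAI

    load_dotenv("reliability/.env")
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

    CHECK_TEMPLATE = (
        "Context: {context}\n\n"
        "Sentence: {sentence}\n\n"
        "Is the sentence supported by the context above? Answer Yes or No."
    )

    def prompt_hallucination_score(main_passage: str, sampled_passages: list[str],
                                   judge_model: str = "gpt-3.5-turbo") -> float:
        """Fraction of (sentence, sample) checks judged unsupported, averaged over sentences."""
        scores = []
        for sent in sent_tokenize(main_passage):
            votes = []
            for sample in sampled_passages:
                answer = client.chat.completions.create(
                    model=judge_model,
                    temperature=0.0,  # deterministic yes/no judgement
                    messages=[{"role": "user",
                               "content": CHECK_TEMPLATE.format(context=sample, sentence=sent)}],
                ).choices[0].message.content.strip().lower()
                votes.append(0.0 if answer.startswith("yes") else 1.0)  # 1.0 = not supported
            scores.append(sum(votes) / len(votes))
        # The passage-level score is the mean over sentence-level scores.
        return sum(scores) / len(scores)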

Data

The data for the reliability experiments can be downloaded from the following link:

Download Reliability Experiments Data

Contact
