Cognitive Mirage: Hallucinations in LLMs

Awesome License: MIT

Uncritical trust in LLMs can give rise to a phenomenon we call the Cognitive Mirage, leading to misguided decision-making and a cascade of unintended consequences. To help control the risk of hallucinations, we summarize recent progress in hallucination theory and solutions in this paper, and organize the relevant work into a comprehensive survey.

[:bell: News! :bell:] We have released a new survey paper, "Cognitive Mirage: A Review of Hallucinations in Large Language Models", based on this repository, offering a systematic perspective on hallucinations in LLMs! We look forward to any comments or discussions on this topic :)

🕵️ Introduction

As large language models continue to advance in the field of AI, text generation systems are susceptible to a worrisome phenomenon known as hallucination. In this study, we summarize recent compelling insights into hallucinations in LLMs. We present a novel taxonomy of hallucinations across various text generation tasks, and provide theoretical insights, detection methods, and improvement approaches. Based on this, future research directions are proposed. Our contributions are threefold: (1) we provide a detailed and complete taxonomy for hallucinations appearing in text generation tasks; (2) we provide theoretical analyses of hallucinations in LLMs together with existing detection and improvement methods; (3) we propose several research directions that can be developed in the future. As hallucinations garner significant attention from the community, we will keep this repository updated with relevant research progress.

🏆 A Timeline of LLMs

| LLM Name | Title | Authors | Publication Date |
| --- | --- | --- | --- |
| T5 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. | Colin Raffel, Noam Shazeer, Adam Roberts | 2019.10 |
| GPT-3 | Language Models are Few-Shot Learners. | Tom B. Brown, Benjamin Mann, Nick Ryder | 2020.12 |
| mT5 | mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer. | Linting Xue, Noah Constant, Adam Roberts | 2021.3 |
| Codex | Evaluating Large Language Models Trained on Code. | Mark Chen, Jerry Tworek, Heewoo Jun | 2021.7 |
| FLAN | Finetuned Language Models are Zero-Shot Learners. | Jason Wei, Maarten Bosma, Vincent Y. Zhao | 2021.9 |
| WebGPT | WebGPT: Browser-assisted question-answering with human feedback. | Reiichiro Nakano, Jacob Hilton, Suchir Balaji | 2021.12 |
| InstructGPT | Training language models to follow instructions with human feedback. | Long Ouyang, Jeffrey Wu, Xu Jiang | 2022.3 |
| CodeGen | CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis. | Erik Nijkamp, Bo Pang, Hiroaki Hayashi | 2022.3 |
| Claude | Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback. | Yuntao Bai, Andy Jones, Kamal Ndousse | 2022.4 |
| PaLM | PaLM: Scaling Language Modeling with Pathways. | Aakanksha Chowdhery, Sharan Narang, Jacob Devlin | 2022.4 |
| OPT | OPT: Open Pre-trained Transformer Language Models. | Susan Zhang, Stephen Roller, Naman Goyal | 2022.5 |
| Super-NaturalInstructions | Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks. | Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi | 2022.9 |
| GLM | GLM-130B: An Open Bilingual Pre-trained Model. | Aohan Zeng, Xiao Liu, Zhengxiao Du | 2022.10 |
| BLOOM | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model. | Teven Le Scao, Angela Fan, Christopher Akiki | 2022.11 |
| LLaMA | LLaMA: Open and Efficient Foundation Language Models. | Hugo Touvron, Thibaut Lavril, Gautier Izacard | 2023.2 |
| Alpaca | Alpaca: A Strong, Replicable Instruction-Following Model. | Rohan Taori, Ishaan Gulrajani, Tianyi Zhang | 2023.3 |
| GPT-4 | GPT-4 Technical Report. | OpenAI | 2023.3 |
| WizardLM | WizardLM: Empowering Large Language Models to Follow Complex Instructions. | Can Xu, Qingfeng Sun, Kai Zheng | 2023.4 |
| Vicuna | Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90% ChatGPT Quality. | The Vicuna Team | 2023.5 |
| ChatGLM | ChatGLM. | Wisdom and Clear Speech Team | 2023.6 |
| Llama2 | Llama 2: Open Foundation and Fine-Tuned Chat Models. | Hugo Touvron, Louis Martin, Kevin Stone | 2023.7 |

🏳‍🌈 Definition of Hallucination

  • "Truthful AI: Developing and governing AI that does not lie.", 2021.10

    • Owain Evans, Owen Cotton-Barratt, Lukas Finnveden
    • [Paper]
  • "Survey of Hallucination in Natural Language Generation", 2022.2

    • Ziwei Ji, Nayeon Lee, Rita Frieske
    • [Paper]
  • "Context-faithful Prompting for Large Language Models.", 2023.3

    • Wenxuan Zhou, Sheng Zhang, Hoifung Poon
    • [Paper]
  • "Do Language Models Know When They’re Hallucinating References?", 2023.5

    • Ayush Agrawal, Lester Mackey, Adam Tauman Kalai
    • [Paper]

🎉 Mechanism Analysis

✨Data Attribution

  • "Data Distributional Properties Drive Emergent In-Context Learning in Transformers.", 2022.5

    • Stephanie C. Y. Chan, Adam Santoro, Andrew K. Lampinen
    • [Paper]
  • "Towards Tracing Factual Knowledge in Language Models Back to the Training Data.", 2022.5

    • Ekin Akyürek, Tolga Bolukbasi, Frederick Liu
    • [Paper]
  • "A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity.", 2023.2

    • Yejin Bang, Samuel Cahyawijaya, Nayeon Lee
    • [Paper]
  • "Hallucinations in Large Multilingual Translation Models.", 2023.3

    • Nuno Miguel Guerreiro, Duarte M. Alves, Jonas Waldendorf
    • [Paper]
  • "Visual Instruction Tuning.", 2023.4

    • Haotian Liu, Chunyuan Li, Qingyang Wu
    • [Paper]
  • "Evaluating Object Hallucination in Large Vision-Language Models.", 2023.5

    • Yifan Li, Yifan Du, Kun Zhou
    • [Paper]
  • "Sources of Hallucination by Large Language Models on Inference Tasks.", 2023.5

    • Nick McKenna, Tianyi Li, Liang Cheng
    • [Paper]
  • "Automatic Evaluation of Attribution by Large Language Models.", 2023.5

    • Xiang Yue, Boshi Wang, Kai Zhang
    • [Paper]
  • "Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning.", 2023.6

    • Fuxiao Liu, Kevin Lin, Linjie Li
    • [Paper]

✨Knowledge Gap

  • "A Survey of Knowledge-enhanced Text Generation.", 2022.1

    • Wenhao Yu, Chenguang Zhu, Zaitang Li
    • [Paper]
  • "Attributed Text Generation via Post-hoc Research and Revision.", 2022.10

    • Luyu Gao, Zhuyun Dai, Panupong Pasupat
    • [Paper]
  • "Artificial Hallucinations in ChatGPT: Implications in Scientific Writing.", 2023.2

    • Hussam Alkaissi, Samy I McFarlane
    • [Paper]
  • "SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models.", 2023.3

    • Potsawee Manakul, Adian Liusie, Mark J. F. Gales
    • [Paper]
  • "Why Does ChatGPT Fall Short in Answering Questions Faithfully?", 2023.4

    • Shen Zheng, Jie Huang, Kevin Chen-Chuan Chang
    • [Paper]
  • "Zero-shot Faithful Factual Error Correction.", 2023.5

    • Kung-Hsiang Huang, Hou Pong Chan, Heng Ji
    • [Paper]
  • "Mitigating Language Model Hallucination with Interactive Question-Knowledge Alignment.", 2023.5

    • Shuo Zhang, Liangming Pan, Junzhou Zhao
    • [Paper]
  • "Adaptive Chameleon or Stubborn Sloth: Unraveling the Behavior of Large Language Models in Knowledge Clashes.", 2023.5

    • Jian Xie, Kai Zhang, Jiangjie Chen
    • [Paper]
  • "Evaluating Generative Models for Graph-to-Text Generation.", 2023.7

    • Shuzhou Yuan, Michael Färber
    • [Paper]
  • "Overthinking the Truth: Understanding how Language Models Process False Demonstrations.", 2023.7

    • Danny Halawi, Jean-Stanislas Denain, Jacob Steinhardt
    • [Paper]

✨Optimum Formulation

  • "Improved Natural Language Generation via Loss Truncation.", 2020.5

    • Daniel Kang, Tatsunori Hashimoto
    • [Paper]
  • "The Curious Case of Hallucinations in Neural Machine Translation.", 2021.4

    • Vikas Raunak, Arul Menezes, Marcin Junczys-Dowmunt
    • [Paper]
  • "Optimal Transport for Unsupervised Hallucination Detection in Neural Machine Translation.", 2022.12

    • Nuno Miguel Guerreiro, Pierre Colombo, Pablo Piantanida
    • [Paper]
  • "Elastic Weight Removal for Faithful and Abstractive Dialogue Generation.", 2023.3

    • Nico Daheim, Nouha Dziri, Mrinmaya Sachan
    • [Paper]
  • "HistAlign: Improving Context Dependency in Language Generation by Aligning with History.", 2023.5

    • David Wan, Shiyue Zhang, Mohit Bansal
    • [Paper]
  • "How Language Model Hallucinations Can Snowball.", 2023.5

    • Muru Zhang, Ofir Press, William Merrill
    • [Paper]
  • "Improving Language Models via Plug-and-Play Retrieval Feedback.", 2023.5

    • Wenhao Yu, Zhihan Zhang, Zhenwen Liang
    • [Paper]
  • "Teaching Language Models to Hallucinate Less with Synthetic Tasks.", 2023.10

    • Erik Jones, Hamid Palangi, Clarisse Simões
    • [Paper]

🪁 Taxonomy of LLM Hallucinations in NLP Tasks

✨Machine Translation

  • "Unsupervised Cross-lingual Representation Learning at Scale.", 2020.7

    • Alexis Conneau, Kartikay Khandelwal, Naman Goyal
    • [Paper]
  • "The Curious Case of Hallucinations in Neural Machine Translation.", 2021.4

    • Vikas Raunak, Arul Menezes, Marcin Junczys-Dowmunt
    • [Paper]
  • "Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation.", 2022.5

    • Tu Vu, Aditya Barua, Brian Lester
    • [Paper]
  • "Looking for a Needle in a Haystack: A Comprehensive Study of Hallucinations in Neural Machine Translation.", 2022.8

    • Nuno Miguel Guerreiro, Elena Voita, André F. T. Martins
    • [Paper]
  • "Prompting PaLM for Translation: Assessing Strategies and Performance.", 2022.11

    • David Vilar, Markus Freitag, Colin Cherry
    • [Paper]
  • "The unreasonable effectiveness of few-shot learning for machine translation.", 2023.2

    • Xavier Garcia, Yamini Bansal, Colin Cherry
    • [Paper]
  • "How Good Are GPT Models at Machine Translation? A Comprehensive Evaluation.", 2023.2

    • Amr Hendy, Mohamed Abdelrehim, Amr Sharaf
    • [Paper]
  • "Hallucinations in Large Multilingual Translation Models.", 2023.3

    • Nuno Miguel Guerreiro, Duarte M. Alves, Jonas Waldendorf
    • [Paper]
  • "Investigating the Translation Performance of a Large Multilingual Language Model: the Case of BLOOM.", 2023.3

    • Rachel Bawden, François Yvon
    • [Paper]
  • "HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation.", 2023.5

    • David Dale, Elena Voita, Janice Lam, Prangthip Hansanti
    • [Paper]
  • "mmT5: Modular Multilingual Pre-Training Solves Source Language Hallucinations.", 2023.5

    • Jonas Pfeiffer, Francesco Piccinno, Massimo Nicosia
    • [Paper]

✨Question Answering

  • "Hurdles to Progress in Long-form Question Answering.", 2021.6

    • Kalpesh Krishna, Aurko Roy, Mohit Iyyer
    • [Paper]
  • "Entity-Based Knowledge Conflicts in Question Answering.", 2021.9

    • Shayne Longpre, Kartik Perisetla, Anthony Chen
    • [Paper]
  • "TruthfulQA: Measuring How Models Mimic Human Falsehoods.", 2022.5

    • Stephanie Lin, Jacob Hilton, Owain Evans
    • [Paper]
  • "Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback.", 2023.2

    • Baolin Peng, Michel Galley, Pengcheng He
    • [Paper]
  • "Why Does ChatGPT Fall Short in Answering Questions Faithfully?", 2023.4

    • Shen Zheng, Jie Huang, Kevin Chen-Chuan Chang
    • [Paper]
  • "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering.", 2023.7

    • Vaibhav Adlakha, Parishad BehnamGhader, Xing Han Lu
    • [Paper]
  • "Med-HALT: Medical Domain Hallucination Test for Large Language Models.", 2023.7

    • Logesh Kumar Umapathi, Ankit Pal, Malaikannan Sankarasubbu
    • [Paper]

✨Dialog System

  • "On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models?", 2022.7

    • Nouha Dziri, Sivan Milton, Mo Yu
    • [Paper]
  • "Contrastive Learning Reduces Hallucination in Conversations.", 2022.12

    • Weiwei Sun, Zhengliang Shi, Shen Gao
    • [Paper]
  • "Diving Deep into Modes of Fact Hallucinations in Dialogue Systems.", 2022.12

    • Souvik Das, Sougata Saha, Rohini K. Srihari
    • [Paper]
  • "Elastic Weight Removal for Faithful and Abstractive Dialogue Generation.", 2023.3

    • Nico Daheim, Nouha Dziri, Mrinmaya Sachan
    • [Paper]

✨Summarization System

  • "Hallucinated but Factual! Inspecting the Factuality of Hallucinations in Abstractive Summarization.", 2022.5

    • Meng Cao, Yue Dong, Jackie Chi Kit Cheung
    • [Paper]
  • "Evaluating the Factual Consistency of Large Language Models.", 2022.11

    • Derek Tam, Anisha Mascarenhas, Shiyue Zhang
    • [Paper]
  • "Why is this misleading?": Detecting News Headline Hallucinations with Explanations.", 2023.2

    • Jiaming Shen, Jialu Liu, Dan Finnie
    • [Paper]
  • "Detecting and Mitigating Hallucinations in Multilingual Summarisation.", 2023.5

    • Yifu Qiu, Yftah Ziser, Anna Korhonen
    • [Paper]
  • "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond.", 2023.5

    • Philippe Laban, Wojciech Kryściński, Divyansh Agarwal
    • [Paper]
  • "Evaluating Factual Consistency of Texts with Semantic Role Labeling.", 2023.5

    • Jing Fan, Dennis Aumiller, Michael Gertz
    • [Paper]
  • "Summarization is (Almost) Dead.", 2023.9

    • Xiao Pu, Mingqi Gao, Xiaojun Wan
    • [Paper]

✨Knowledge Graphs with LLMs

  • "GPT-NER: Named Entity Recognition via Large Language Models.", 2023.4

    • Shuhe Wang, Xiaofei Sun, Xiaoya Li
    • [Paper]
  • "LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities.", 2023.5

    • Yuqi Zhu, Xiaohan Wang, Jing Chen
    • [Paper]
  • "KoLA: Carefully Benchmarking World Knowledge of Large Language Models.", 2023.6

    • Jifan Yu, Xiaozhi Wang, Shangqing Tu
    • [Paper]
  • "Evaluating Generative Models for Graph-to-Text Generation.", 2023.7

    • Shuzhou Yuan, Michael Färber
    • [Paper]
  • "Text2KGBench: A Benchmark for Ontology-Driven Knowledge Graph Generation from Text.", 2023.8

    • Nandana Mihindukulasooriya, Sanju Tiwari, Carlos F. Enguix
    • [Paper]

✨Cross-modal System

  • "Let there be a clock on the beach: Reducing Object Hallucination in Image Captioning.", 2021.10

    • Ali Furkan Biten, Lluís Gómez, Dimosthenis Karatzas
    • [Paper]
  • "Simple Token-Level Confidence Improves Caption Correctness.", 2023.5

    • Suzanne Petryk, Spencer Whitehead, Joseph E. Gonzalez
    • [Paper]
  • "Evaluating Object Hallucination in Large Vision-Language Models.", 2023.5

    • Yifan Li, Yifan Du, Kun Zhou, Jinpeng Wang
    • [Paper]
  • "Album Storytelling with Iterative Story-aware Captioning and Large Language Models.", 2023.5

    • Munan Ning, Yujia Xie, Dongdong Chen
    • [Paper]
  • "Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning.", 2023.6

    • Fuxiao Liu, Kevin Lin, Linjie Li
    • [Paper]
  • "Fact-Checking of AI-Generated Reports.", 2023.7

    • Razi Mahmood, Ge Wang, Mannudeep Kalra
    • [Paper]

✨Others

  • "The Scope of ChatGPT in Software Engineering: A Thorough Investigation.", 2023.5

    • Wei Ma, Shangqing Liu, Wenhan Wang
    • [Paper]
  • "Generating Benchmarks for Factuality Evaluation of Language Models.", 2023.7

    • Dor Muhlgay, Ori Ram, Inbal Magar
    • [Paper]

🔮 Hallucination Detection

✨Inference Classifier

  • "Evaluating the Factual Consistency of Large Language Models Through Summarization.", 2022.11

    • Derek Tam, Anisha Mascarenhas, Shiyue Zhang
    • [Paper]
  • "Why is this misleading?": Detecting News Headline Hallucinations with Explanations.", 2023.2

    • Jiaming Shen, Jialu Liu, Daniel Finnie
    • [Paper]
  • "HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models.", 2023.5

    • Junyi Li, Xiaoxue Cheng, Wayne Xin Zhao
    • [Paper]
  • "Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning.", 2023.6

    • Fuxiao Liu, Kevin Lin, Linjie Li
    • [Paper]
  • "Fact-Checking of AI-Generated Reports.", 2023.7

    • Razi Mahmood, Ge Wang, Mannudeep K. Kalra
    • [Paper]
  • "Chain of Natural Language Inference for Reducing Large Language Model Ungrounded Hallucinations.", 2023.10

    • Deren Lei, Yaxi Li, Mengya Hu
    • [Paper]
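
The works above cast hallucination detection as an inference (entailment) classification problem: each generated claim is checked against a source or evidence text with an NLI-style classifier. A minimal sketch, assuming a caller-supplied `nli` function (a hypothetical placeholder, not the interface of any specific paper) that returns an entailment label:

```python
from typing import Callable, List, Tuple

# Hypothetical signature: nli(premise, hypothesis) -> one of "entailment",
# "neutral", "contradiction". Plug in any NLI classifier you like.
NLIModel = Callable[[str, str], str]

def flag_unsupported_claims(
    source: str, claims: List[str], nli: NLIModel
) -> List[Tuple[str, str]]:
    """Label each claim by whether the source text entails it."""
    results = []
    for claim in claims:
        label = nli(source, claim)
        # Claims that are not entailed by the source are hallucination candidates.
        status = "supported" if label == "entailment" else "possibly hallucinated"
        results.append((claim, status))
    return results
```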

✨Uncertainty Measure

  • "BARTScore: Evaluating Generated Text as Text Generation.", 2021.6

    • Weizhe Yuan, Graham Neubig, Pengfei Liu
    • [Paper]
  • "Contrastive Learning Reduces Hallucination in Conversations.", 2022.12

    • Weiwei Sun, Zhengliang Shi, Shen Gao
    • [Paper]
  • "Knowledge of Knowledge: Exploring Known-Unknowns Uncertainty with Large Language Models.", 2023.5

    • Alfonso Amayuelas, Liangming Pan, Wenhu Chen
    • [Paper]
  • "Methods for Measuring, Updating, and Visualizing Factual Beliefs in Language Models.", 2023.5

    • Peter Hase, Mona Diab, Asli Celikyilmaz
    • [Paper]
  • "Measuring and Modifying Factual Knowledge in Large Language Models.", 2023.6

  • "LLM Calibration and Automatic Hallucination Detection via Pareto Optimal Self-supervision.", 2023.6

    • Theodore Zhao, Mu Wei, J. Samuel Preston
    • [Paper]
  • "A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation.", 2023.7

    • Neeraj Varshney, Wenlin Yao, Hongming Zhang
    • [Paper]
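
A recurring idea in this line of work (e.g., the low-confidence validation of "A Stitch in Time Saves Nine") is to treat the model's own token probabilities as an uncertainty signal: spans generated with low probability are the ones to re-check first. A minimal sketch, assuming the caller already has per-token log-probabilities for a generation (the threshold value is purely illustrative):

```python
import math
from typing import List, Tuple

def low_confidence_spans(
    tokens: List[str], logprobs: List[float], threshold: float = math.log(0.3)
) -> List[Tuple[int, str]]:
    """Return (position, token) pairs whose log-probability falls below a threshold.

    Low-probability tokens mark spans worth validating against external evidence.
    """
    return [
        (i, tok)
        for i, (tok, lp) in enumerate(zip(tokens, logprobs))
        if lp < threshold
    ]

# Example with made-up numbers: the year "1947" is generated with low confidence.
tokens = ["The", "treaty", "was", "signed", "in", "1947", "."]
logprobs = [-0.1, -0.4, -0.2, -0.3, -0.5, -2.1, -0.1]
print(low_confidence_spans(tokens, logprobs))  # -> [(5, '1947')]
```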

✨Self-Evaluation

  • "Language Models (Mostly) Know What They Know.", 2022.7

    • Saurav Kadavath, Tom Conerly, Amanda Askell
    • [Paper]
  • "SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models.", 2023.3

    • Potsawee Manakul, Adian Liusie, Mark J. F. Gales
    • [Paper]
  • "Do Language Models Know When They’re Hallucinating References?", 2023.5

    • Ayush Agrawal, Lester Mackey, Adam Tauman Kalai
    • [Paper]
  • "Evaluating Object Hallucination in Large Vision-Language Models.", 2023.5

    • Yifan Li, Yifan Du, Kun Zhou, Jinpeng Wang
    • [Paper]
  • "Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models.", 2023.5

    • Miaoran Li, Baolin Peng, Zhu Zhang
    • [Paper]
  • "LM vs LM: Detecting Factual Errors via Cross Examination.", 2023.5

    • Roi Cohen, May Hamri, Mor Geva
    • [Paper]
  • "Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation.", 2023.5

    • Niels Mündler, Jingxuan He, Slobodan Jenko
    • [Paper]
  • "A New Benchmark and Reverse Validation Method for Passage-level Hallucination Detection.", 2023.10

    • Shiping Yang, Renliang Sun, Xiaojun Wan
    • [Paper]
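
SelfCheckGPT-style methods sample several responses to the same prompt and treat disagreement between samples as a hallucination signal: facts the model reliably "knows" tend to reappear across samples. A minimal black-box sketch, assuming a caller-supplied `generate` function and a simple lexical-overlap consistency score (the papers above use stronger NLI- or QA-based scorers):

```python
from typing import Callable, List

def consistency_score(sentence: str, samples: List[str]) -> float:
    """Fraction of sampled responses that share most of the sentence's content words.

    A crude stand-in for the NLI/QA scorers used in the literature.
    """
    words = {w.lower() for w in sentence.split() if len(w) > 3}
    if not words:
        return 1.0
    hits = sum(
        1 for s in samples
        if len(words & {w.lower() for w in s.split()}) >= 0.5 * len(words)
    )
    return hits / max(len(samples), 1)

def self_check(prompt: str, answer_sentences: List[str],
               generate: Callable[[str], str], n_samples: int = 5) -> List[float]:
    """Score each sentence of an answer by how consistently it reappears in resamples."""
    samples = [generate(prompt) for _ in range(n_samples)]
    # Low scores indicate sentences the model does not reproduce reliably.
    return [consistency_score(sent, samples) for sent in answer_sentences]
```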

✨Evidence Retrieval

  • "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation.", 2023.5

    • Sewon Min, Kalpesh Krishna, Xinxi Lyu
    • [Paper]
  • "Complex Claim Verification with Evidence Retrieved in the Wild.", 2023.5

    • Jifan Chen, Grace Kim, Aniruddh Sriram
    • [Paper]
  • "Retrieving Supporting Evidence for LLMs Generated Answers.", 2023.6

    • Siqing Huo, Negar Arabzadeh, Charles L. A. Clarke
    • [Paper]
  • "FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios.", 2023.7

    • I-Chun Chern, Steffi Chern, Shiqi Chen
    • [Paper]
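
FActScore-style pipelines decompose a long generation into atomic facts, retrieve evidence for each fact, and report the fraction of facts the evidence supports. A minimal sketch with caller-supplied `retrieve` and `is_supported` functions (both hypothetical placeholders):

```python
from typing import Callable, List

def factual_precision(
    atomic_facts: List[str],
    retrieve: Callable[[str], List[str]],            # fact -> evidence passages
    is_supported: Callable[[str, List[str]], bool],  # (fact, evidence) -> verdict
) -> float:
    """Fraction of atomic facts supported by retrieved evidence (FActScore-like)."""
    if not atomic_facts:
        return 1.0
    supported = 0
    for fact in atomic_facts:
        evidence = retrieve(fact)
        if is_supported(fact, evidence):
            supported += 1
    return supported / len(atomic_facts)
```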

🪄 Hallucination Correction

✨Parameter Enhancement

  • "Factuality Enhanced Language Models for Open-Ended Text Generation.", 2022.6

    • Nayeon Lee, Wei Ping, Peng Xu
    • [Paper]
  • "Contrastive Learning Reduces Hallucination in Conversations.", 2022.12

    • Weiwei Sun, Zhengliang Shi, Shen Gao
    • [Paper]
  • "Editing Models with Task Arithmetic.", 2023.2

    • Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman
    • [Paper]
  • "Elastic Weight Removal for Faithful and Abstractive Dialogue Generation.", 2023.3

    • Nico Daheim, Nouha Dziri, Mrinmaya Sachan
    • [Paper]
  • "HISTALIGN: Improving Context Dependency in Language Generation by Aligning with History.", 2023.5

    • David Wan, Shiyue Zhang, Mohit Bansal
    • [Paper]
  • "mmT5: Modular Multilingual Pre-Training Solves Source Language Hallucinations.", 2023.5

    • Jonas Pfeiffer, Francesco Piccinno, Massimo Nicosia
    • [Paper]
  • "Trusting Your Evidence: Hallucinate Less with Context-aware Decoding.", 2023.5

    • Weijia Shi, Xiaochuang Han, Mike Lewis
    • [Paper]
  • "PURR: Efficiently Editing Language Model Hallucinations by Denoising Language Model Corruptions.", 2023.5

    • Anthony Chen, Panupong Pasupat, Sameer Singh
    • [Paper]
  • "Augmented Large Language Models with Parametric Knowledge Guiding.", 2023.5

    • Ziyang Luo, Can Xu, Pu Zhao, Xiubo Geng
    • [Paper]
  • "Inference-Time Intervention: Eliciting Truthful Answers from a Language Model.", 2023.6

    • Kenneth Li, Oam Patel, Fernanda Viégas
    • [Paper]
  • "TRAC: Trustworthy Retrieval Augmented Chatbot.", 2023.7

    • Shuo Li, Sangdon Park, Insup Lee
    • [Paper]
  • "EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models", 2023.8

    • Peng Wang, Ningyu Zhang, Xin Xie
    • [Paper]
  • "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models", 2023.9

    • Yung-Sung Chuang, Yujia Xie, Hongyin Luo
    • [Paper]
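
Several of these works intervene directly in decoding. For instance, context-aware decoding ("Trusting Your Evidence") contrasts the model's output distribution with and without the grounding context, so that tokens whose probability rises once the evidence is in the prompt are favoured. A minimal sketch of the logit adjustment for one decoding step (NumPy; the caller supplies the two logit vectors, and the alpha value is illustrative):

```python
import numpy as np

def context_aware_logits(
    logits_with_context: np.ndarray,
    logits_without_context: np.ndarray,
    alpha: float = 0.5,
) -> np.ndarray:
    """Contrast the context-conditioned and context-free next-token logits.

    adjusted = (1 + alpha) * logits_with_context - alpha * logits_without_context,
    which boosts tokens that become more likely once the evidence is visible.
    """
    return (1 + alpha) * logits_with_context - alpha * logits_without_context

# Toy vocabulary of 4 tokens: the context shifts mass toward token 2.
with_ctx = np.array([1.0, 0.5, 3.0, 0.2])
without_ctx = np.array([1.0, 0.5, 1.0, 0.2])
print(np.argmax(context_aware_logits(with_ctx, without_ctx)))  # -> 2
```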

✨Post-hoc Attribution and Edit Technology

  • "Neural Path Hunter: Reducing Hallucination in Dialogue Systems via Path Grounding.", 2021.4

    • Nouha Dziri, Andrea Madotto, Osmar Zaïane
    • [Paper]
  • "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.", 2022.1

    • Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma
    • [Paper]
  • "Teaching language models to support answers with verified quotes.", 2022.3

    • Jacob Menick, Maja Trebacz, Vladimir Mikulik
    • [Paper]
  • "ORCA: Interpreting Prompted Language Models via Locating Supporting Data Evidence in the Ocean of Pretraining Data.", 2022.5

    • Xiaochuang Han, Yulia Tsvetkov
    • [Paper]
  • "Large Language Models are Zero-Shot Reasoners.", 2022.5

    • Takeshi Kojima, Shixiang Shane Gu, Machel Reid
    • [Paper]
  • "Rethinking with Retrieval: Faithful Large Language Model Inference.", 2023.1

    • Hangfeng He, Hongming Zhang, Dan Roth
    • [Paper]
  • "TRAK: Attributing Model Behavior at Scale.", 2023.3

    • Sung Min Park, Kristian Georgiev, Andrew Ilyas
    • [Paper]
  • "Data Portraits: Recording Foundation Model Training Data.", 2023.3

    • Marc Marone, Benjamin Van Durme
    • [Paper]
  • "Self-Refine: Iterative Refinement with Self-Feedback.", 2023.3

    • Aman Madaan, Niket Tandon, Prakhar Gupta
    • [Paper]
  • "Reflexion: an autonomous agent with dynamic memory and self-reflection.", 2023.3

    • Noah Shinn, Beck Labash, Ashwin Gopinath
    • [Paper]
  • "According to ..." Prompting Language Models Improves Quoting from Pre-Training Data.", 2023.5

    • Orion Weller, Marc Marone, Nathaniel Weir
    • [Paper]
  • "Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework.", 2023.5

    • Ruochen Zhao, Xingxuan Li, Shafiq Joty
    • [Paper]
  • "Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning.", 2023.9

    • Shehzaad Dhuliawala, Mojtaba Komeili, Jing Xu
    • [Paper]
  • "Chain of Natural Language Inference for Reducing Large Language Model Ungrounded Hallucinations.", 2023.10

    • Deren Lei, Yaxi Li, Mengya Hu
    • [Paper]
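
Verify-and-Edit and Chain-of-Verification both follow a draft → verify → revise loop: the model drafts an answer, poses verification questions about its own claims, answers them independently (optionally with retrieval), and rewrites the draft in light of those answers. A minimal sketch over a caller-supplied `llm` function (the prompt wording is illustrative, not taken from any of the papers):

```python
from typing import Callable

def verify_and_revise(question: str, llm: Callable[[str], str]) -> str:
    """Draft an answer, self-verify its claims, then produce a revised answer."""
    draft = llm(f"Answer the question.\nQuestion: {question}\nAnswer:")

    # Ask the model to turn its own claims into checkable verification questions.
    plan = llm(
        "List short verification questions, one per line, that would check the "
        f"factual claims in this answer:\n{draft}"
    )
    checks = []
    for q in filter(None, (line.strip() for line in plan.splitlines())):
        # Each verification question is answered in a fresh prompt so the
        # original draft cannot bias the check.
        checks.append(f"Q: {q}\nA: {llm(q)}")

    # Rewrite the draft so that it agrees with the independently verified answers.
    return llm(
        f"Original question: {question}\nDraft answer: {draft}\n"
        "Verified facts:\n" + "\n".join(checks) +
        "\nRewrite the draft so it is consistent with the verified facts:"
    )
```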

✨Utilizing Programming Languages

  • "PAL: Program-aided Language Models.", 2022.11

    • Luyu Gao, Aman Madaan, Shuyan Zhou
    • [Paper]
  • "Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks.", 2022.11

    • Wenhu Chen, Xueguang Ma, Xinyi Wang
    • [Paper]
  • "Teaching Algorithmic Reasoning via In-context Learning.", 2022.11

    • Hattie Zhou, Azade Nova, Hugo Larochelle
    • [Paper]
  • "Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification.", 2023.8

    • Aojun Zhou, Ke Wang, Zimu Lu
    • [Paper]
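
PAL and Program-of-Thoughts offload the steps a model is most likely to hallucinate (arithmetic, counting, date math) to an interpreter: the LLM writes a small program and the runtime computes the answer. A minimal sketch with a caller-supplied `llm`; executing model-generated code with `exec` is shown purely for illustration and should be sandboxed in practice:

```python
from typing import Callable

def program_aided_answer(question: str, llm: Callable[[str], str]):
    """Ask the model for Python that computes the answer, then run it locally."""
    code = llm(
        "Write Python code that computes the answer to the question and stores "
        f"it in a variable named `answer`.\nQuestion: {question}\nCode:"
    )
    scope: dict = {}
    exec(code, scope)  # WARNING: sandbox model-generated code in real use.
    return scope.get("answer")

# Example with a stub "model" that always returns the same program.
fake_llm = lambda prompt: "answer = (23 * 7) - 4"
print(program_aided_answer("What is 23 times 7, minus 4?", fake_llm))  # -> 157
```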

✨Leverage External Knowledge

  • "Improving Language Models by Retrieving from Trillions of Tokens.", 2021.12

    • Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann
    • [Paper]
  • "Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions., 2022.12

    • Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot
    • [Paper]
  • "When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories.", 2022.12

    • Alex Mallen, Akari Asai, Victor Zhong
    • [Paper]
  • "Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback.", 2023.2

    • Baolin Peng, Michel Galley, Pengcheng He
    • [Paper]
  • "In-Context Retrieval-Augmented Language Models.", 2023.2

    • Ori Ram, Yoav Levine, Itay Dalmedigos
    • [Paper]
  • "cTBL: Augmenting Large Language Models for Conversational Tables.", 2023.3

    • Anirudh S. Sundar, Larry Heck
    • [Paper]
  • "GeneGPT: Augmenting Large Language Models with Domain Tools for Improved Access to Biomedical Information.", 2023.4

    • Qiao Jin, Yifan Yang, Qingyu Chen
    • [Paper]
  • "Active Retrieval Augmented Generation.", 2023.5

    • Zhengbao Jiang, Frank F. Xu, Luyu Gao
    • [Paper]
  • "Chain of Knowledge: A Framework for Grounding Large Language Models with Structured Knowledge Bases.", 2023.5

    • Xingxuan Li, Ruochen Zhao, Yew Ken Chia
    • [Paper]
  • "Gorilla: Large Language Model Connected with Massive APIs.", 2023.5

    • Shishir G. Patil, Tianjun Zhang, Xin Wang
    • [Paper]
  • "RETA-LLM: A Retrieval-Augmented Large Language Model Toolkit.", 2023.6

    • Jiongnan Liu, Jiajie Jin, Zihan Wang
    • [Paper]
  • "User-Controlled Knowledge Fusion in Large Language Models: Balancing Creativity and Hallucination.", 2023.7

  • "KnowledGPT: Enhancing Large Language Models with Retrieval and Storage Access on Knowledge Bases.", 2023.8

    • Xintao Wang, Qianwen Yang, Yongting Qiu
    • [Paper]
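
The common recipe in this group is retrieval augmentation: fetch passages relevant to the query from an external source and place them in the prompt, so the model grounds its answer in retrieved text rather than parametric memory. A minimal sketch with caller-supplied `retrieve` and `llm` functions (both hypothetical placeholders):

```python
from typing import Callable, List

def retrieval_augmented_answer(
    question: str,
    retrieve: Callable[[str], List[str]],  # query -> ranked passages
    llm: Callable[[str], str],
    k: int = 3,
) -> str:
    """Ground the answer in retrieved passages rather than parametric memory."""
    passages = retrieve(question)[:k]
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    prompt = (
        "Answer the question using only the passages below. "
        "If the passages do not contain the answer, say so.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
    return llm(prompt)
```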

✨Assessment Feedback

  • "Learning to summarize with human feedback.", 2020.12

    • Nisan Stiennon, Long Ouyang, Jeffrey Wu
    • [Paper]
  • "BRIO: Bringing Order to Abstractive Summarization.", 2022.3

    • Yixin Liu, Pengfei Liu, Dragomir R. Radev
    • [Paper]
  • "Language Models (Mostly) Know What They Know.", 2022.7

    • Saurav Kadavath, Tom Conerly, Amanda Askell
    • [Paper]
  • "Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback.", 2023.2

    • Baolin Peng, Michel Galley, Pengcheng He
    • [Paper]
  • "Chain of Hindsight Aligns Language Models with Feedback.", 2023.2

    • Hao Liu, Carmelo Sferrazza, Pieter Abbeel
    • [Paper]
  • "Zero-shot Faithful Factual Error Correction.", 2023.5

    • Kung-Hsiang Huang, Hou Pong Chan, Heng Ji
    • [Paper]
  • "CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing.", 2023.5

    • Zhibin Gou, Zhihong Shao, Yeyun Gong
    • [Paper]
  • "Album Storytelling with Iterative Story-aware Captioning and Large Language Models.", 2023.5

    • Munan Ning, Yujia Xie, Dongdong Chen
    • [Paper]
  • "How Language Model Hallucinations Can Snowball.", 2023.5

    • Muru Zhang, Ofir Press, William Merrill
    • [Paper]
  • "Mitigating Language Model Hallucination with Interactive Question-Knowledge Alignment.", 2023.5

    • Shuo Zhang, Liangming Pan, Junzhou Zhao
    • [Paper]
  • "Improving Language Models via Plug-and-Play Retrieval Feedback.", 2023.5

    • Wenhao Yu, Zhihan Zhang, Zhenwen Liang
    • [Paper]
  • "PaD: Program-aided Distillation Specializes Large Models in Reasoning.", 2023.5

    • Xuekai Zhu, Biqing Qi, Kaiyan Zhang
    • [Paper]
  • "Enabling Large Language Models to Generate Text with Citations.", 2023.5

    • Tianyu Gao, Howard Yen, Jiatong Yu
    • [Paper]
  • "Do Language Models Know When They’re Hallucinating References?", 2023.5

    • Ayush Agrawal, Lester Mackey, Adam Tauman Kalai
    • [Paper]
  • "Improving Factuality of Abstractive Summarization via Contrastive Reward Learning.", 2023.7

    • I-Chun Chern, Zhiruo Wang, Sanjan Das
    • [Paper]
  • "Towards Mitigating Hallucination in Large Language Models via Self-Reflection.", 2023.10

    • Ziwei Ji, Tiezheng Yu, Yan Xu
    • [Paper]
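
Feedback-based correction scores candidate generations with an external or self-assessed signal (a preference model, a factuality checker, or the model's own reflection) and uses that signal to select or refine the output. A minimal best-of-n sketch with caller-supplied `generate` and `score_factuality` functions (both placeholders):

```python
from typing import Callable, List

def best_of_n(
    prompt: str,
    generate: Callable[[str], str],
    score_factuality: Callable[[str, str], float],  # (prompt, answer) -> score
    n: int = 4,
) -> str:
    """Sample n candidates and return the one the feedback signal rates highest."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda ans: score_factuality(prompt, ans))
```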

✨Mindset Society

  • "Hallucinations in Large Multilingual Translation Models.", 2023.3

    • Nuno Miguel Guerreiro, Duarte M. Alves, Jonas Waldendorf
    • [Paper]
  • "Improving Factuality and Reasoning in Language Models through Multiagent Debate.", 2023.5

    • Yilun Du, Shuang Li, Antonio Torralba
    • [Paper]
  • "Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate.", 2023.5

    • Tian Liang, Zhiwei He, Wenxiang Jiao
    • [Paper]
  • "Examining the Inter-Consistency of Large Language Models: An In-depth Analysis via Debate.", 2023.5

    • Kai Xiong, Xiao Ding, Yixin Cao
    • [Paper]
  • "LM vs LM: Detecting Factual Errors via Cross Examination.", 2023.5

    • Roi Cohen, May Hamri, Mor Geva
    • [Paper]
  • "PRD: Peer Rank and Discussion Improve Large Language Model based Evaluations.", 2023.7

    • Ruosen Li, Teerth Patel, Xinya Du
    • [Paper]
  • "Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration.", 2023.7

    • Zhenhailong Wang, Shaoguang Mao, Wenshan Wu
    • [Paper]
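
Multi-agent ("society of minds") approaches let several model instances answer independently, read one another's answers, and revise over a few rounds; answers that survive the debate tend to be more factual than any single pass. A minimal sketch with a caller-supplied `llm` function (the round structure and prompt wording are illustrative):

```python
from typing import Callable, List

def multi_agent_debate(
    question: str, llm: Callable[[str], str], n_agents: int = 3, rounds: int = 2
) -> List[str]:
    """Run a simple debate: each agent revises its answer after seeing the others'."""
    answers = [llm(f"Question: {question}\nAnswer:") for _ in range(n_agents)]
    for _ in range(rounds):
        new_answers = []
        for i in range(n_agents):
            others = "\n".join(
                f"Agent {j + 1}: {a}" for j, a in enumerate(answers) if j != i
            )
            new_answers.append(llm(
                f"Question: {question}\nYour previous answer: {answers[i]}\n"
                f"Other agents answered:\n{others}\n"
                "Update your answer, correcting any factual errors:"
            ))
        answers = new_answers
    return answers  # a final aggregation or majority vote can follow
```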

🌟 TIPS

If you find this repository useful for your research or work, we would really appreciate it if you starred this repository.