Cognitive Mirage: Hallucinations in LLMs

Awesome License: MIT

Uncritical trust in LLMs can give rise to a phenomenon we call the Cognitive Mirage, leading to misguided decision-making and a cascade of unintended consequences. To help control the risk of hallucinations, we summarize recent progress in hallucination theory and solutions in this paper, and organize the relevant work into a comprehensive survey.

[:bell: News! :bell:] We have released a new survey paper, "Cognitive Mirage: A Review of Hallucinations in Large Language Models", based on this repository, offering a systematic perspective on hallucinations in LLMs! We look forward to any comments or discussions on this topic :)

🕵️ Introduction

As large language models continue to advance in the field of AI, text generation systems are susceptible to a worrisome phenomenon known as hallucination. In this study, we summarize recent compelling insights into hallucinations in LLMs. We present a novel taxonomy of hallucinations across various text generation tasks, and provide theoretical insights, detection methods, and improvement approaches. Based on this, future research directions are proposed. Our contributions are threefold: (1) we provide a detailed and complete taxonomy for hallucinations appearing in text generation tasks; (2) we provide theoretical analyses of hallucinations in LLMs together with existing detection and improvement methods; (3) we propose several research directions that can be developed in the future. As hallucinations garner significant attention from the community, we will keep this repository updated with relevant research progress.

🏆 A Timeline of LLMs

| LLM Name | Title | Authors | Publication Date |
| --- | --- | --- | --- |
| T5 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. | Colin Raffel, Noam Shazeer, Adam Roberts | 2019.10 |
| GPT-3 | Language Models are Few-Shot Learners. | Tom B. Brown, Benjamin Mann, Nick Ryder | 2020.12 |
| mT5 | mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer. | Linting Xue, Noah Constant, Adam Roberts | 2021.3 |
| Codex | Evaluating Large Language Models Trained on Code. | Mark Chen, Jerry Tworek, Heewoo Jun | 2021.7 |
| FLAN | Finetuned Language Models are Zero-Shot Learners. | Jason Wei, Maarten Bosma, Vincent Y. Zhao | 2021.9 |
| WebGPT | WebGPT: Browser-assisted question-answering with human feedback. | Reiichiro Nakano, Jacob Hilton, Suchir Balaji | 2021.12 |
| InstructGPT | Training language models to follow instructions with human feedback. | Long Ouyang, Jeffrey Wu, Xu Jiang | 2022.3 |
| CodeGen | CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis. | Erik Nijkamp, Bo Pang, Hiroaki Hayashi | 2022.3 |
| Claude | Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback. | Yuntao Bai, Andy Jones, Kamal Ndousse | 2022.4 |
| PaLM | PaLM: Scaling Language Modeling with Pathways. | Aakanksha Chowdhery, Sharan Narang, Jacob Devlin | 2022.4 |
| OPT | OPT: Open Pre-trained Transformer Language Models. | Susan Zhang, Stephen Roller, Naman Goyal | 2022.5 |
| Super-NaturalInstructions | Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks. | Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi | 2022.9 |
| GLM | GLM-130B: An Open Bilingual Pre-trained Model. | Aohan Zeng, Xiao Liu, Zhengxiao Du | 2022.10 |
| BLOOM | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model. | Teven Le Scao, Angela Fan, Christopher Akiki | 2022.11 |
| LLaMA | LLaMA: Open and Efficient Foundation Language Models. | Hugo Touvron, Thibaut Lavril, Gautier Izacard | 2023.2 |
| Alpaca | Alpaca: A Strong, Replicable Instruction-Following Model. | Rohan Taori, Ishaan Gulrajani, Tianyi Zhang | 2023.3 |
| GPT-4 | GPT-4 Technical Report. | OpenAI | 2023.3 |
| WizardLM | WizardLM: Empowering Large Language Models to Follow Complex Instructions. | Can Xu, Qingfeng Sun, Kai Zheng | 2023.4 |
| Vicuna | Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90% ChatGPT Quality. | The Vicuna Team | 2023.5 |
| ChatGLM | ChatGLM. | Wisdom and Clear Speech Team | 2023.6 |
| Llama2 | Llama 2: Open Foundation and Fine-Tuned Chat Models. | Hugo Touvron, Louis Martin, Kevin Stone | 2023.7 |

🏳‍🌈 Definition of Hallucination

  • "Truthful AI: Developing and governing AI that does not lie.", 2021.10

    • Owain Evans, Owen Cotton-Barratt, Lukas Finnveden
    • [Paper]
  • "Survey of Hallucination in Natural Language Generation", 2022.2

    • Ziwei Ji, Nayeon Lee, Rita Frieske
    • [Paper]
  • "Context-faithful Prompting for Large Language Models.", 2023.3

    • Wenxuan Zhou, Sheng Zhang, Hoifung Poon
    • [Paper]
  • "Do Language Models Know When They’re Hallucinating References?", 2023.5

    • Ayush Agrawal, Lester Mackey, Adam Tauman Kalai
    • [Paper]

🎉 Mechanism Analysis

✨Data Attribution

  • "Data Distributional Properties Drive Emergent In-Context Learning in Transformers.", 2022.5

    • Stephanie C. Y. Chan, Adam Santoro, Andrew K. Lampinen
    • [Paper]
  • "Towards Tracing Factual Knowledge in Language Models Back to the Training Data.", 2022.5

    • Ekin Akyürek, Tolga Bolukbasi, Frederick Liu
    • [Paper]
  • "A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity.", 2023.2

    • Yejin Bang, Samuel Cahyawijaya, Nayeon Lee
    • [Paper]
  • "Hallucinations in Large Multilingual Translation Models.", 2023.3

    • Nuno Miguel Guerreiro, Duarte M. Alves, Jonas Waldendorf
    • [Paper]
  • "Visual Instruction Tuning.", 2023.4

    • Haotian Liu, Chunyuan Li, Qingyang Wu
    • [Paper]
  • "Evaluating Object Hallucination in Large Vision-Language Models.", 2023.5

    • Yifan Li, Yifan Du, Kun Zhou
    • [Paper]
  • "Sources of Hallucination by Large Language Models on Inference Tasks.", 2023.5

    • Nick McKenna, Tianyi Li, Liang Cheng
    • [Paper]
  • "Automatic Evaluation of Attribution by Large Language Models.", 2023.5

    • Xiang Yue, Boshi Wang, Kai Zhang
    • [Paper]
  • "Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning.", 2023.6

    • Fuxiao Liu, Kevin Lin, Linjie Li
    • [Paper]

✨Knowledge Gap

  • "A Survey of Knowledge-enhanced Text Generation.", 2022.1

    • Wenhao Yu, Chenguang Zhu, Zaitang Li
    • [Paper]
  • "Attributed Text Generation via Post-hoc Research and Revision.", 2022.10

    • Luyu Gao, Zhuyun Dai, Panupong Pasupat
    • [Paper]
  • "Artificial Hallucinations in ChatGPT: Implications in Scientific Writing.", 2023.2

    • Hussam Alkaissi, Samy I McFarlane
    • [Paper]
  • "SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models.", 2023.3

    • Potsawee Manakul, Adian Liusie, Mark J. F. Gales
    • [Paper]
  • "Why Does ChatGPT Fall Short in Answering Questions Faithfully?", 2023.4

    • Shen Zheng, Jie Huang, Kevin Chen-Chuan Chang
    • [Paper]
  • "Zero-shot Faithful Factual Error Correction.", 2023.5

    • Kung-Hsiang Huang, Hou Pong Chan, Heng Ji
    • [Paper]
  • "Mitigating Language Model Hallucination with Interactive Question-Knowledge Alignment.", 2023.5

    • Shuo Zhang, Liangming Pan, Junzhou Zhao
    • [Paper]
  • "Adaptive Chameleon or Stubborn Sloth: Unraveling the Behavior of Large Language Models in Knowledge Clashes.", 2023.5

    • Jian Xie, Kai Zhang, Jiangjie Chen
    • [Paper]
  • "Evaluating Generative Models for Graph-to-Text Generation.", 2023.7

    • Shuzhou Yuan, Michael Färber
    • [Paper]
  • "Overthinking the Truth: Understanding how Language Models Process False Demonstrations.", 2023.7

    • Danny Halawi, Jean-Stanislas Denain, Jacob Steinhardt
    • [Paper]

✨Optimum Formulation

  • "Improved Natural Language Generation via Loss Truncation.", 2020.5

    • Daniel Kang, Tatsunori Hashimoto
    • [Paper]
  • "The Curious Case of Hallucinations in Neural Machine Translation.", 2021.4

    • Vikas Raunak, Arul Menezes, Marcin Junczys-Dowmunt
    • [Paper]
  • "Optimal Transport for Unsupervised Hallucination Detection in Neural Machine Translation.", 2022.12

    • Nuno Miguel Guerreiro, Pierre Colombo, Pablo Piantanida
    • [Paper]
  • "Elastic Weight Removal for Faithful and Abstractive Dialogue Generation.", 2023.3

    • Nico Daheim, Nouha Dziri, Mrinmaya Sachan
    • [Paper]
  • "HistAlign: Improving Context Dependency in Language Generation by Aligning with History.", 2023.5

    • David Wan, Shiyue Zhang, Mohit Bansal
    • [Paper]
  • "How Language Model Hallucinations Can Snowball.", 2023.5

    • Muru Zhang, Ofir Press, William Merrill
    • [Paper]
  • "Improving Language Models via Plug-and-Play Retrieval Feedback.", 2023.5

    • Wenhao Yu, Zhihan Zhang, Zhenwen Liang
    • [Paper]
  • "Teaching Language Models to Hallucinate Less with Synthetic Tasks.", 2023.10

    • Erik Jones, Hamid Palangi, Clarisse Simões
    • [Paper]

🪁 Taxonomy of LLM Hallucinations in NLP Tasks

✨Machine Translation

  • "Unsupervised Cross-lingual Representation Learning at Scale.", 2020.7

    • Alexis Conneau, Kartikay Khandelwal, Naman Goyal
    • [Paper]
  • "The Curious Case of Hallucinations in Neural Machine Translation.", 2021.4

    • Vikas Raunak, Arul Menezes, Marcin Junczys-Dowmunt
    • [Paper]
  • "Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation.", 2022.5

    • Tu Vu, Aditya Barua, Brian Lester
    • [Paper]
  • "Looking for a Needle in a Haystack: A Comprehensive Study of Hallucinations in Neural Machine Translation.", 2022.8

    • Nuno Miguel Guerreiro, Elena Voita, André F. T. Martins
    • [Paper]
  • "Prompting PaLM for Translation: Assessing Strategies and Performance.", 2022.11

    • David Vilar, Markus Freitag, Colin Cherry
    • [Paper]
  • "The unreasonable effectiveness of few-shot learning for machine translation.", 2023.2

    • Xavier Garcia, Yamini Bansal, Colin Cherry
    • [Paper]
  • "How Good Are GPT Models at Machine Translation? A Comprehensive Evaluation.", 2023.2

    • Amr Hendy, Mohamed Abdelrehim, Amr Sharaf
    • [Paper]
  • "Hallucinations in Large Multilingual Translation Models.", 2023.3

    • Nuno Miguel Guerreiro, Duarte M. Alves, Jonas Waldendorf
    • [Paper]
  • "Investigating the Translation Performance of a Large Multilingual Language Model: the Case of BLOOM.", 2023.3

    • Rachel Bawden, François Yvon
    • [Paper]
  • "HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation.", 2023.5

    • David Dale, Elena Voita, Janice Lam, Prangthip Hansanti
    • [Paper]
  • "mmT5: Modular Multilingual Pre-Training Solves Source Language Hallucinations.", 2023.5

    • Jonas Pfeiffer, Francesco Piccinno, Massimo Nicosia
    • [Paper]

✨Question Answering

  • "Hurdles to Progress in Long-form Question Answering.", 2021.6

    • Kalpesh Krishna, Aurko Roy, Mohit Iyyer
    • [Paper]
  • "Entity-Based Knowledge Conflicts in Question Answering.", 2021.9

    • Shayne Longpre, Kartik Perisetla, Anthony Chen
    • [Paper]
  • "TruthfulQA: Measuring How Models Mimic Human Falsehoods.", 2022.5

    • Stephanie Lin, Jacob Hilton, Owain Evans
    • [Paper]
  • "Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback.", 2023.2

    • Baolin Peng, Michel Galley, Pengcheng He
    • [Paper]
  • "Why Does ChatGPT Fall Short in Answering Questions Faithfully?", 2023.4

    • Shen Zheng, Jie Huang, Kevin Chen-Chuan Chang
    • [Paper]
  • "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering.", 2023.7

    • Vaibhav Adlakha, Parishad BehnamGhader, Xing Han Lu
    • [Paper]
  • "Med-HALT: Medical Domain Hallucination Test for Large Language Models.", 2023.7

    • Logesh Kumar Umapathi, Ankit Pal, Malaikannan Sankarasubbu
    • [Paper]

✨Dialog System

  • "On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models?", 2022.7

    • Nouha Dziri, Sivan Milton, Mo Yu
    • [Paper]
  • "Contrastive Learning Reduces Hallucination in Conversations.", 2022.12

    • Weiwei Sun, Zhengliang Shi, Shen Gao
    • [Paper]
  • "Diving Deep into Modes of Fact Hallucinations in Dialogue Systems.", 2022.12

    • Souvik Das, Sougata Saha, Rohini K. Srihari
    • [Paper]
  • "Elastic Weight Removal for Faithful and Abstractive Dialogue Generation.", 2023.3

    • Nico Daheim, Nouha Dziri, Mrinmaya Sachan
    • [Paper]

✨Summarization System

  • "Hallucinated but Factual! Inspecting the Factuality of Hallucinations in Abstractive Summarization.", 2022.5

    • Meng Cao, Yue Dong, Jackie Chi Kit Cheung
    • [Paper]
  • "Evaluating the Factual Consistency of Large Language Models.", 2022.11

    • Derek Tam, Anisha Mascarenhas, Shiyue Zhang
    • [Paper]
  • "Why is this misleading?": Detecting News Headline Hallucinations with Explanations.", 2023.2

    • Jiaming Shen, Jialu Liu, Dan Finnie
    • [Paper]
  • "Detecting and Mitigating Hallucinations in Multilingual Summarisation.", 2023.5

    • Yifu Qiu, Yftah Ziser, Anna Korhonen
    • [Paper]
  • "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond.", 2023.5

    • Philippe Laban, Wojciech Kryściński, Divyansh Agarwal
    • [Paper]
  • "Evaluating Factual Consistency of Texts with Semantic Role Labeling.", 2023.5

    • Jing Fan, Dennis Aumiller, Michael Gertz
    • [Paper]
  • "Summarization is (Almost) Dead.", 2023.9

    • Xiao Pu, Mingqi Gao, Xiaojun Wan
    • [Paper]

✨Knowledge Graphs with LLMs

  • "GPT-NER: Named Entity Recognition via Large Language Models.", 2023.4

    • Shuhe Wang, Xiaofei Sun, Xiaoya Li
    • [Paper]
  • "LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities.", 2023.5

    • Yuqi Zhu, Xiaohan Wang, Jing Chen
    • [Paper]
  • "KoLA: Carefully Benchmarking World Knowledge of Large Language Models.", 2023.6

    • Jifan Yu, Xiaozhi Wang, Shangqing Tu
    • [Paper]
  • "Evaluating Generative Models for Graph-to-Text Generation.", 2023.7

    • Shuzhou Yuan, Michael Färber
    • [Paper]
  • "Text2KGBench: A Benchmark for Ontology-Driven Knowledge Graph Generation from Text.", 2023.8

    • Nandana Mihindukulasooriya, Sanju Tiwari, Carlos F. Enguix
    • [Paper]

✨Cross-modal System

  • "Let there be a clock on the beach: Reducing Object Hallucination in Image Captioning.", 2021.10

    • Ali Furkan Biten, Lluís Gómez, Dimosthenis Karatzas
    • [Paper]
  • "Simple Token-Level Confidence Improves Caption Correctness.", 2023.5

    • Suzanne Petryk, Spencer Whitehead, Joseph E. Gonzalez
    • [Paper]
  • "Evaluating Object Hallucination in Large Vision-Language Models.", 2023.5

    • Yifan Li, Yifan Du, Kun Zhou, Jinpeng Wang
    • [Paper]
  • "Album Storytelling with Iterative Story-aware Captioning and Large Language Models.", 2023.5

    • Munan Ning, Yujia Xie, Dongdong Chen
    • [Paper]
  • "Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning.", 2023.6

    • Fuxiao Liu, Kevin Lin, Linjie Li
    • [Paper]
  • "Fact-Checking of AI-Generated Reports.", 2023.7

    • Razi Mahmood, Ge Wang, Mannudeep Kalra
    • [Paper]

✨Others

  • "The Scope of ChatGPT in Software Engineering: A Thorough Investigation.", 2023.5

    • Wei Ma, Shangqing Liu, Wenhan Wang
    • [Paper]
  • "Generating Benchmarks for Factuality Evaluation of Language Models.", 2023.7

    • Dor Muhlgay, Ori Ram, Inbal Magar
    • [Paper]

🔮 Hallucination Detection

✨Inference Classifier

  • "Evaluating the Factual Consistency of Large Language Models Through Summarization.", 2022.11

    • Derek Tam, Anisha Mascarenhas, Shiyue Zhang
    • [Paper]
  • "Why is this misleading?": Detecting News Headline Hallucinations with Explanations.", 2023.2

    • Jiaming Shen, Jialu Liu, Daniel Finnie
    • [Paper]
  • "HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models.", 2023.5

    • Junyi Li, Xiaoxue Cheng, Wayne Xin Zhao
    • [Paper]
  • "Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning.", 2023.6

    • Fuxiao Liu, Kevin Lin, Linjie Li
    • [Paper]
  • "Fact-Checking of AI-Generated Reports.", 2023.7

    • Razi Mahmood, Ge Wang, Mannudeep K. Kalra
    • [Paper]
  • "Chain of Natural Language Inference for Reducing Large Language Model Ungrounded Hallucinations.", 2023.10

    • Deren Lei, Yaxi Li, Mengya Hu
    • [Paper]
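
The works above cast hallucination detection as an inference (entailment) classification problem: each generated claim is checked against a source or evidence text with an NLI-style classifier. A minimal sketch, assuming a caller-supplied `nli` function (a hypothetical placeholder, not the interface of any specific paper) that returns an entailment label:

```python
from typing import Callable, List, Tuple

# Hypothetical signature: nli(premise, hypothesis) -> one of "entailment",
# "neutral", "contradiction". Plug in any NLI classifier you like.
NLIModel = Callable[[str, str], str]

def flag_unsupported_claims(
    source: str, claims: List[str], nli: NLIModel
) -> List[Tuple[str, str]]:
    """Label each claim by whether the source text entails it."""
    results = []
    for claim in claims:
        label = nli(source, claim)
        # Claims that are not entailed by the source are hallucination candidates.
        status = "supported" if label == "entailment" else "possibly hallucinated"
        results.append((claim, status))
    return results
```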

✨Uncertainty Measure

  • "BARTScore: Evaluating Generated Text as Text Generation.", 2021.6

    • Weizhe Yuan, Graham Neubig, Pengfei Liu
    • [Paper]
  • "Contrastive Learning Reduces Hallucination in Conversations.", 2022.12

    • Weiwei Sun, Zhengliang Shi, Shen Gao
    • [Paper]
  • "Knowledge of Knowledge: Exploring Known-Unknowns Uncertainty with Large Language Models.", 2023.5

    • Alfonso Amayuelas, Liangming Pan, Wenhu Chen
    • [Paper]
  • "Methods for Measuring, Updating, and Visualizing Factual Beliefs in Language Models.", 2023.5

    • Peter Hase, Mona Diab, Asli Celikyilmaz
    • [Paper]
  • "Measuring and Modifying Factual Knowledge in Large Language Models.", 2023.6

  • "LLM Calibration and Automatic Hallucination Detection via Pareto Optimal Self-supervision.", 2023.6

    • Theodore Zhao, Mu Wei, J. Samuel Preston
    • [Paper]
  • "A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation.", 2023.7

    • Neeraj Varshney, Wenlin Yao, Hongming Zhang
    • [Paper]
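
A recurring idea in this line of work (e.g., the low-confidence validation of "A Stitch in Time Saves Nine") is to treat the model's own token probabilities as an uncertainty signal: spans generated with low probability are the ones to re-check first. A minimal sketch, assuming the caller already has per-token log-probabilities for a generation (the threshold value is purely illustrative):

```python
import math
from typing import List, Tuple

def low_confidence_spans(
    tokens: List[str], logprobs: List[float], threshold: float = math.log(0.3)
) -> List[Tuple[int, str]]:
    """Return (position, token) pairs whose log-probability falls below a threshold.

    Low-probability tokens mark spans worth validating against external evidence.
    """
    return [
        (i, tok)
        for i, (tok, lp) in enumerate(zip(tokens, logprobs))
        if lp < threshold
    ]

# Example with made-up numbers: the year "1947" is generated with low confidence.
tokens = ["The", "treaty", "was", "signed", "in", "1947", "."]
logprobs = [-0.1, -0.4, -0.2, -0.3, -0.5, -2.1, -0.1]
print(low_confidence_spans(tokens, logprobs))  # -> [(5, '1947')]
```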

✨Self-Evaluation

  • "Language Models (Mostly) Know What They Know.", 2022.7

    • Saurav Kadavath, Tom Conerly, Amanda Askell
    • [Paper]
  • "SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models.", 2023.3

    • Potsawee Manakul, Adian Liusie, Mark J. F. Gales
    • [Paper]
  • "Do Language Models Know When They’re Hallucinating References?", 2023.5

    • Ayush Agrawal, Lester Mackey, Adam Tauman Kalai
    • [Paper]
  • "Evaluating Object Hallucination in Large Vision-Language Models.", 2023.5

    • Yifan Li, Yifan Du, Kun Zhou, Jinpeng Wang
    • [Paper]
  • "Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models.", 2023.5

    • Miaoran Li, Baolin Peng, Zhu Zhang
    • [Paper]
  • "LM vs LM: Detecting Factual Errors via Cross Examination.", 2023.5

    • Roi Cohen, May Hamri, Mor Geva
    • [Paper]
  • "Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation.", 2023.5

    • Niels Mündler, Jingxuan He, Slobodan Jenko
    • [Paper]
  • "A New Benchmark and Reverse Validation Method for Passage-level Hallucination Detection.", 2023.10

    • Shiping Yang, Renliang Sun, Xiaojun Wan
    • [Paper]
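
SelfCheckGPT-style methods sample several responses to the same prompt and treat disagreement between samples as a hallucination signal: facts the model reliably "knows" tend to reappear across samples. A minimal black-box sketch, assuming a caller-supplied `generate` function and a simple lexical-overlap consistency score (the papers above use stronger NLI- or QA-based scorers):

```python
from typing import Callable, List

def consistency_score(sentence: str, samples: List[str]) -> float:
    """Fraction of sampled responses that share most of the sentence's content words.

    A crude stand-in for the NLI/QA scorers used in the literature.
    """
    words = {w.lower() for w in sentence.split() if len(w) > 3}
    if not words:
        return 1.0
    hits = sum(
        1 for s in samples
        if len(words & {w.lower() for w in s.split()}) >= 0.5 * len(words)
    )
    return hits / max(len(samples), 1)

def self_check(prompt: str, answer_sentences: List[str],
               generate: Callable[[str], str], n_samples: int = 5) -> List[float]:
    """Score each sentence of an answer by how consistently it reappears in resamples."""
    samples = [generate(prompt) for _ in range(n_samples)]
    # Low scores indicate sentences the model does not reproduce reliably.
    return [consistency_score(sent, samples) for sent in answer_sentences]
```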

✨Evidence Retrieval

  • "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation.", 2023.5

    • Sewon Min, Kalpesh Krishna, Xinxi Lyu
    • [Paper]
  • "Complex Claim Verification with Evidence Retrieved in the Wild.", 2023.5

    • Jifan Chen, Grace Kim, Aniruddh Sriram
    • [Paper]
  • "Retrieving Supporting Evidence for LLMs Generated Answers.", 2023.6

    • Siqing Huo, Negar Arabzadeh, Charles L. A. Clarke
    • [Paper]
  • "FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios.", 2023.7

    • I-Chun Chern, Steffi Chern, Shiqi Chen
    • [Paper]
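
FActScore-style pipelines decompose a long generation into atomic facts, retrieve evidence for each fact, and report the fraction of facts the evidence supports. A minimal sketch with caller-supplied `retrieve` and `is_supported` functions (both hypothetical placeholders):

```python
from typing import Callable, List

def factual_precision(
    atomic_facts: List[str],
    retrieve: Callable[[str], List[str]],            # fact -> evidence passages
    is_supported: Callable[[str, List[str]], bool],  # (fact, evidence) -> verdict
) -> float:
    """Fraction of atomic facts supported by retrieved evidence (FActScore-like)."""
    if not atomic_facts:
        return 1.0
    supported = 0
    for fact in atomic_facts:
        evidence = retrieve(fact)
        if is_supported(fact, evidence):
            supported += 1
    return supported / len(atomic_facts)
```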

🪄 Hallucination Correction

✨Parameter Enhancement

  • "Factuality Enhanced Language Models for Open-Ended Text Generation.", 2022.6

    • Nayeon Lee, Wei Ping, Peng Xu
    • [Paper]
  • "Contrastive Learning Reduces Hallucination in Conversations.", 2022.12

    • Weiwei Sun, Zhengliang Shi, Shen Gao
    • [Paper]
  • "Editing Models with Task Arithmetic.", 2023.2

    • Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman
    • [Paper]
  • "Elastic Weight Removal for Faithful and Abstractive Dialogue Generation.", 2023.3

    • Nico Daheim, Nouha Dziri, Mrinmaya Sachan
    • [Paper]
  • "HISTALIGN: Improving Context Dependency in Language Generation by Aligning with History.", 2023.5

    • David Wan, Shiyue Zhang, Mohit Bansal
    • [Paper]
  • "mmT5: Modular Multilingual Pre-Training Solves Source Language Hallucinations.", 2023.5

    • Jonas Pfeiffer, Francesco Piccinno, Massimo Nicosia
    • [Paper]
  • "Trusting Your Evidence: Hallucinate Less with Context-aware Decoding.", 2023.5

    • Weijia Shi, Xiaochuang Han, Mike Lewis
    • [Paper]
  • "PURR: Efficiently Editing Language Model Hallucinations by Denoising Language Model Corruptions.", 2023.5

    • Anthony Chen, Panupong Pasupat, Sameer Singh
    • [Paper]
  • "Augmented Large Language Models with Parametric Knowledge Guiding.", 2023.5

    • Ziyang Luo, Can Xu, Pu Zhao, Xiubo Geng
    • [Paper]
  • "Inference-Time Intervention: Eliciting Truthful Answers from a Language Model.", 2023.6

    • Kenneth Li, Oam Patel, Fernanda Viégas
    • [Paper]
  • "TRAC: Trustworthy Retrieval Augmented Chatbot.", 2023.7

    • Shuo Li, Sangdon Park, Insup Lee
    • [Paper]
  • "EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models", 2023.8

    • Peng Wang, Ningyu Zhang, Xin Xie
    • [Paper]
  • "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models", 2023.9

    • Yung-Sung Chuang, Yujia Xie, Hongyin Luo
    • [Paper]
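
Several of these works intervene directly in decoding. For instance, context-aware decoding ("Trusting Your Evidence") contrasts the model's output distribution with and without the grounding context, so that tokens whose probability rises once the evidence is in the prompt are favoured. A minimal sketch of the logit adjustment for one decoding step (NumPy; the caller supplies the two logit vectors, and the alpha value is illustrative):

```python
import numpy as np

def context_aware_logits(
    logits_with_context: np.ndarray,
    logits_without_context: np.ndarray,
    alpha: float = 0.5,
) -> np.ndarray:
    """Contrast the context-conditioned and context-free next-token logits.

    adjusted = (1 + alpha) * logits_with_context - alpha * logits_without_context,
    which boosts tokens that become more likely once the evidence is visible.
    """
    return (1 + alpha) * logits_with_context - alpha * logits_without_context

# Toy vocabulary of 4 tokens: the context shifts mass toward token 2.
with_ctx = np.array([1.0, 0.5, 3.0, 0.2])
without_ctx = np.array([1.0, 0.5, 1.0, 0.2])
print(np.argmax(context_aware_logits(with_ctx, without_ctx)))  # -> 2
```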

✨Post-hoc Attribution and Edit Technology

  • "Neural Path Hunter: Reducing Hallucination in Dialogue Systems via Path Grounding.", 2021.4

    • Nouha Dziri, Andrea Madotto, Osmar Zaïane
    • [Paper]
  • "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.", 2022.1

    • Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma
    • [Paper]
  • "Teaching language models to support answers with verified quotes.", 2022.3

    • Jacob Menick, Maja Trebacz, Vladimir Mikulik
    • [Paper]
  • "ORCA: Interpreting Prompted Language Models via Locating Supporting Data Evidence in the Ocean of Pretraining Data.", 2022.5

    • Xiaochuang Han, Yulia Tsvetkov
    • [Paper]
  • "Large Language Models are Zero-Shot Reasoners.", 2022.5

    • Takeshi Kojima, Shixiang Shane Gu, Machel Reid
    • [Paper]
  • "Rethinking with Retrieval: Faithful Large Language Model Inference.", 2023.1

    • Hangfeng He, Hongming Zhang, Dan Roth
    • [Paper]
  • "TRAK: Attributing Model Behavior at Scale.", 2023.3

    • Sung Min Park, Kristian Georgiev, Andrew Ilyas
    • [Paper]
  • "Data Portraits: Recording Foundation Model Training Data.", 2023.3

    • Marc Marone, Benjamin Van Durme
    • [Paper]
  • "Self-Refine: Iterative Refinement with Self-Feedback.", 2023.3

    • Aman Madaan, Niket Tandon, Prakhar Gupta
    • [Paper]
  • "Reflexion: an autonomous agent with dynamic memory and self-reflection.", 2023.3

    • Noah Shinn, Beck Labash, Ashwin Gopinath
    • [Paper]
  • "According to ..." Prompting Language Models Improves Quoting from Pre-Training Data.", 2023.5

    • Orion Weller, Marc Marone, Nathaniel Weir
    • [Paper]
  • "Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework.", 2023.5

    • Ruochen Zhao, Xingxuan Li, Shafiq Joty
    • [Paper]
  • "Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning.", 2023.9

    • Shehzaad Dhuliawala, Mojtaba Komeili, Jing Xu
    • [Paper]
  • "Chain of Natural Language Inference for Reducing Large Language Model Ungrounded Hallucinations.", 2023.10

    • Deren Lei, Yaxi Li, Mengya Hu
    • [Paper]
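
Verify-and-Edit and Chain-of-Verification both follow a draft → verify → revise loop: the model drafts an answer, poses verification questions about its own claims, answers them independently (optionally with retrieval), and rewrites the draft in light of those answers. A minimal sketch over a caller-supplied `llm` function (the prompt wording is illustrative, not taken from any of the papers):

```python
from typing import Callable

def verify_and_revise(question: str, llm: Callable[[str], str]) -> str:
    """Draft an answer, self-verify its claims, then produce a revised answer."""
    draft = llm(f"Answer the question.\nQuestion: {question}\nAnswer:")

    # Ask the model to turn its own claims into checkable verification questions.
    plan = llm(
        "List short verification questions, one per line, that would check the "
        f"factual claims in this answer:\n{draft}"
    )
    checks = []
    for q in filter(None, (line.strip() for line in plan.splitlines())):
        # Each verification question is answered in a fresh prompt so the
        # original draft cannot bias the check.
        checks.append(f"Q: {q}\nA: {llm(q)}")

    # Rewrite the draft so that it agrees with the independently verified answers.
    return llm(
        f"Original question: {question}\nDraft answer: {draft}\n"
        "Verified facts:\n" + "\n".join(checks) +
        "\nRewrite the draft so it is consistent with the verified facts:"
    )
```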

✨Utilizing Programming Languages

  • "PAL: Program-aided Language Models.", 2022.11

    • Luyu Gao, Aman Madaan, Shuyan Zhou
    • [Paper]
  • "Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks.", 2022.11

    • Wenhu Chen, Xueguang Ma, Xinyi Wang
    • [Paper]
  • "Teaching Algorithmic Reasoning via In-context Learning.", 2022.11

    • Hattie Zhou, Azade Nova, Hugo Larochelle
    • [Paper]
  • "Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification.", 2023.8

    • Aojun Zhou, Ke Wang, Zimu Lu
    • [Paper]
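
PAL and Program-of-Thoughts offload the steps a model is most likely to hallucinate (arithmetic, counting, date math) to an interpreter: the LLM writes a small program and the runtime computes the answer. A minimal sketch with a caller-supplied `llm`; executing model-generated code with `exec` is shown purely for illustration and should be sandboxed in practice:

```python
from typing import Callable

def program_aided_answer(question: str, llm: Callable[[str], str]):
    """Ask the model for Python that computes the answer, then run it locally."""
    code = llm(
        "Write Python code that computes the answer to the question and stores "
        f"it in a variable named `answer`.\nQuestion: {question}\nCode:"
    )
    scope: dict = {}
    exec(code, scope)  # WARNING: sandbox model-generated code in real use.
    return scope.get("answer")

# Example with a stub "model" that always returns the same program.
fake_llm = lambda prompt: "answer = (23 * 7) - 4"
print(program_aided_answer("What is 23 times 7, minus 4?", fake_llm))  # -> 157
```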

✨Leverage External Knowledge

  • "Improving Language Models by Retrieving from Trillions of Tokens.", 2021.12

    • Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann
    • [Paper]
  • "Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions., 2022.12

    • Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot
    • [Paper]
  • "When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories.", 2022.12

    • Alex Mallen, Akari Asai, Victor Zhong
    • [Paper]
  • "Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback.", 2023.2

    • Baolin Peng, Michel Galley, Pengcheng He
    • [Paper]
  • "In-Context Retrieval-Augmented Language Models.", 2023.2

    • Ori Ram, Yoav Levine, Itay Dalmedigos
    • [Paper]
  • "cTBL: Augmenting Large Language Models for Conversational Tables.", 2023.3

    • Anirudh S. Sundar, Larry Heck
    • [Paper]
  • "GeneGPT: Augmenting Large Language Models with Domain Tools for Improved Access to Biomedical Information.", 2023.4

    • Qiao Jin, Yifan Yang, Qingyu Chen
    • [Paper]
  • "Active Retrieval Augmented Generation.", 2023.5

    • Zhengbao Jiang, Frank F. Xu, Luyu Gao
    • [Paper]
  • "Chain of Knowledge: A Framework for Grounding Large Language Models with Structured Knowledge Bases.", 2023.5

    • Xingxuan Li, Ruochen Zhao, Yew Ken Chia
    • [Paper]
  • "Gorilla: Large Language Model Connected with Massive APIs.", 2023.5

    • Shishir G. Patil, Tianjun Zhang, Xin Wang
    • [Paper]
  • "RETA-LLM: A Retrieval-Augmented Large Language Model Toolkit.", 2023.6

    • Jiongnan Liu, Jiajie Jin, Zihan Wang
    • [Paper]
  • "User-Controlled Knowledge Fusion in Large Language Models: Balancing Creativity and Hallucination.", 2023.7

  • "KnowledGPT: Enhancing Large Language Models with Retrieval and Storage Access on Knowledge Bases.", 2023.8

    • Xintao Wang, Qianwen Yang, Yongting Qiu
    • [Paper]
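
The common recipe in this group is retrieval augmentation: fetch passages relevant to the query from an external source and place them in the prompt, so the model grounds its answer in retrieved text rather than parametric memory. A minimal sketch with caller-supplied `retrieve` and `llm` functions (both hypothetical placeholders):

```python
from typing import Callable, List

def retrieval_augmented_answer(
    question: str,
    retrieve: Callable[[str], List[str]],  # query -> ranked passages
    llm: Callable[[str], str],
    k: int = 3,
) -> str:
    """Ground the answer in retrieved passages rather than parametric memory."""
    passages = retrieve(question)[:k]
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    prompt = (
        "Answer the question using only the passages below. "
        "If the passages do not contain the answer, say so.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
    return llm(prompt)
```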

✨Assessment Feedback

  • "Learning to summarize with human feedback.", 2020.12

    • Nisan Stiennon, Long Ouyang, Jeffrey Wu
    • [Paper]
  • "BRIO: Bringing Order to Abstractive Summarization.", 2022.3

    • Yixin Liu, Pengfei Liu, Dragomir R. Radev
    • [Paper]
  • "Language Models (Mostly) Know What They Know.", 2022.7

    • Saurav Kadavath, Tom Conerly, Amanda Askell
    • [Paper]
  • "Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback.", 2023.2

    • Baolin Peng, Michel Galley, Pengcheng He
    • [Paper]
  • "Chain of Hindsight Aligns Language Models with Feedback.", 2023.2

    • Hao Liu, Carmelo Sferrazza, Pieter Abbeel
    • [Paper]
  • "Zero-shot Faithful Factual Error Correction.", 2023.5

    • Kung-Hsiang Huang, Hou Pong Chan, Heng Ji
    • [Paper]
  • "CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing.", 2023.5

    • Zhibin Gou, Zhihong Shao, Yeyun Gong
    • [Paper]
  • "Album Storytelling with Iterative Story-aware Captioning and Large Language Models.", 2023.5

    • Munan Ning, Yujia Xie, Dongdong Chen
    • [Paper]
  • "How Language Model Hallucinations Can Snowball.", 2023.5

    • Muru Zhang, Ofir Press, William Merrill
    • [Paper]
  • "Mitigating Language Model Hallucination with Interactive Question-Knowledge Alignment.", 2023.5

    • Shuo Zhang, Liangming Pan, Junzhou Zhao
    • [Paper]
  • "Improving Language Models via Plug-and-Play Retrieval Feedback.", 2023.5

    • Wenhao Yu, Zhihan Zhang, Zhenwen Liang
    • [Paper]
  • "PaD: Program-aided Distillation Specializes Large Models in Reasoning.", 2023.5

    • Xuekai Zhu, Biqing Qi, Kaiyan Zhang
    • [Paper]
  • "Enabling Large Language Models to Generate Text with Citations.", 2023.5

    • Tianyu Gao, Howard Yen, Jiatong Yu
    • [Paper]
  • "Do Language Models Know When They’re Hallucinating References?", 2023.5

    • Ayush Agrawal, Lester Mackey, Adam Tauman Kalai
    • [Paper]
  • "Improving Factuality of Abstractive Summarization via Contrastive Reward Learning.", 2023.7

    • I-Chun Chern, Zhiruo Wang, Sanjan Das
    • [Paper]
  • "Towards Mitigating Hallucination in Large Language Models via Self-Reflection.", 2023.10

    • Ziwei Ji, Tiezheng Yu, Yan Xu
    • [Paper]
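
Feedback-based correction scores candidate generations with an external or self-assessed signal (a preference model, a factuality checker, or the model's own reflection) and uses that signal to select or refine the output. A minimal best-of-n sketch with caller-supplied `generate` and `score_factuality` functions (both placeholders):

```python
from typing import Callable, List

def best_of_n(
    prompt: str,
    generate: Callable[[str], str],
    score_factuality: Callable[[str, str], float],  # (prompt, answer) -> score
    n: int = 4,
) -> str:
    """Sample n candidates and return the one the feedback signal rates highest."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda ans: score_factuality(prompt, ans))
```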

✨Mindset Society

  • "Hallucinations in Large Multilingual Translation Models.", 2023.3

    • Nuno Miguel Guerreiro, Duarte M. Alves, Jonas Waldendorf
    • [Paper]
  • "Improving Factuality and Reasoning in Language Models through Multiagent Debate.", 2023.5

    • Yilun Du, Shuang Li, Antonio Torralba
    • [Paper]
  • "Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate.", 2023.5

    • Tian Liang, Zhiwei He, Wenxiang Jiao
    • [Paper]
  • "Examining the Inter-Consistency of Large Language Models: An In-depth Analysis via Debate.", 2023.5

    • Kai Xiong, Xiao Ding, Yixin Cao
    • [Paper]
  • "LM vs LM: Detecting Factual Errors via Cross Examination.", 2023.5

    • Roi Cohen, May Hamri, Mor Geva
    • [Paper]
  • "PRD: Peer Rank and Discussion Improve Large Language Model based Evaluations.", 2023.7

    • Ruosen Li, Teerth Patel, Xinya Du
    • [Paper]
  • "Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration.", 2023.7

    • Zhenhailong Wang, Shaoguang Mao, Wenshan Wu
    • [Paper]
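
Multi-agent ("society of minds") approaches let several model instances answer independently, read one another's answers, and revise over a few rounds; answers that survive the debate tend to be more factual than any single pass. A minimal sketch with a caller-supplied `llm` function (the round structure and prompt wording are illustrative):

```python
from typing import Callable, List

def multi_agent_debate(
    question: str, llm: Callable[[str], str], n_agents: int = 3, rounds: int = 2
) -> List[str]:
    """Run a simple debate: each agent revises its answer after seeing the others'."""
    answers = [llm(f"Question: {question}\nAnswer:") for _ in range(n_agents)]
    for _ in range(rounds):
        new_answers = []
        for i in range(n_agents):
            others = "\n".join(
                f"Agent {j + 1}: {a}" for j, a in enumerate(answers) if j != i
            )
            new_answers.append(llm(
                f"Question: {question}\nYour previous answer: {answers[i]}\n"
                f"Other agents answered:\n{others}\n"
                "Update your answer, correcting any factual errors:"
            ))
        answers = new_answers
    return answers  # a final aggregation or majority vote can follow
```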

🌟 TIPS

If you find this repository useful for your research or work, we would really appreciate it if you starred this repository.