hallucination

Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without a custom rubric or reference answer, using absolute or relative grading, and more. It also lists the available tools, methods, repos, and code for hallucination detection, LLM evaluation, grading, and more.

nlp ai evaluation ml pytorch judge feedback-collection sota custom-dataset finetuning hallucination llm llm-evaluation hallucination-detection phi-3

Updated May 31, 2024
Jupyter Notebook

zjunlp / NLPCC2024_RegulatingLLM

[NLPCC 2024] Shared Task 10: Regulating Large Language Models

natural-language-processing artificial-intelligence multimodal hallucination large-language-models detoxification regulating-mechanisms nlpcc-2024

Updated May 31, 2024

jxzhangjhu / Awesome-LLM-Uncertainty-Reliability-Robustness

Awesome-LLM-Robustness: a curated list of Uncertainty, Reliability and Robustness in Large Language Models

reliability calibration safety awesome-list uncertainty-quantification uncertainty-estimation robustness hallucination gpt-3 gpt-4 in-context-learning large-language-models prompt-engineering prompting llms chain-of-thought chatgpt

Updated May 30, 2024

IAAR-Shanghai / UHGEval

[ACL 2024] Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation

Updated May 27, 2024
Python

vr25 / lrec-coling-hallucination-tutorial

LREC-COLING 2024 Tutorial

hallucination llms

Updated May 25, 2024
JavaScript

weijiaheng / CHALE

Controlled HALlucination-Evaluation (CHALE) Question-Answering Dataset

hallucination question-answer llm

Updated May 22, 2024
Python

robertbenson / openai_assistant_code_interpreter

An OpenAI assistant that uses the Code Interpreter tool

assistant hallucination openai-assistant-api

Updated May 20, 2024
Python

zjunlp / FactCHD

[IJCAI 2024] FactCHD: Benchmarking Fact-Conflicting Hallucination Detection

benchmark natural-language-processing knowledge dataset factual hallucination large-language-models factchd

Updated Apr 28, 2024
Python

CrackedResearcher / LLMVerify

Verify outputs generated by LLMs, backed with real-time data

hallucination streamlit gpt-4 llm generative-ai claude-3

Updated Apr 22, 2024
Python

yfzhang114 / LLaVA-Align

This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and Visual Debias Decoding strategy.

hallucination debiasing large-vision-language-models

Updated Mar 28, 2024
Python

qqplot / dcpmi

[NAACL24] Official Implementation of Mitigating Hallucination in Abstractive Summarization with Domain-Conditional Mutual Information

hallucination naacl2024

Updated Mar 27, 2024
Python

ictnlp / TruthX

Code for ACL 2024 paper "TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space"

safety llama representation language-model mistral explainable-ai hallucination baichuan hallucinations gpt-4 truthfulness llm llms chatgpt chatglm llm-inference llama2 llama3

Updated Mar 26, 2024
Python

xieyuquanxx / awesome-Large-MultiModal-Hallucination

😎 An up-to-date, curated list of awesome LMM hallucination papers, methods & resources.

multi-modal multimodal lmm hallucination

Updated Mar 23, 2024

anlp-team / LTI_Neural_Navigator

"Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases" by Jiarui Li, Ye Yuan, and Zehua Zhang

nlp machine-learning web-crawler llama datasets dataset-generation system-design hallucination rag large-language-models llm

Updated Mar 18, 2024
HTML

tianyi-lab / HallusionBench

[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models

benchmark benchmarks lmm hallucination gpt-4 large-language-models llm llava large-vision-language-models vlms gpt-4v

Updated Mar 17, 2024
Python

Here are 36 public repositories matching this topic...

NishilBalar / Awesome-LVLM-Hallucination

amazon-science / RefChecker

zjunlp / EasyDetect

Libr-AI / OpenFactVerification

zjunlp / KnowledgeCircuits

deshwalmahesh / PHUDGE
