arxiv:2502.17125

LettuceDetect: A Hallucination Detection Framework for RAG Applications

Published on Feb 24 · Submitted by adaamko on Mar 3

Abstract

Retrieval Augmented Generation (RAG) systems remain vulnerable to hallucinated answers despite incorporating external knowledge sources. We present LettuceDetect, a framework that addresses two critical limitations in existing hallucination detection methods: (1) the context window constraints of traditional encoder-based methods, and (2) the computational inefficiency of LLM-based approaches. Building on ModernBERT's extended context capabilities (up to 8k tokens) and trained on the RAGTruth benchmark dataset, our approach outperforms all previous encoder-based models and most prompt-based models, while being approximately 30 times smaller than the best models. LettuceDetect is a token-classification model that processes context-question-answer triples, allowing for the identification of unsupported claims at the token level. Evaluations on the RAGTruth corpus demonstrate an F1 score of 79.22% for example-level detection, which is a 14.8% improvement over Luna, the previous state-of-the-art encoder-based architecture. Additionally, the system can process 30 to 60 examples per second on a single GPU, making it more practical for real-world RAG applications.
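As a rough illustration of the token-classification setup described above (not the authors' exact preprocessing or label scheme), the sketch below runs a ModernBERT-style token classifier over a concatenated context-question-answer triple and prints the tokens flagged as unsupported. The checkpoint name, input template, and label mapping are assumptions.

```python
# Minimal sketch of token-level hallucination detection with a
# Hugging Face token-classification model. The model ID, the way the
# triple is concatenated, and the label mapping are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

model_id = "KRLabsOrg/lettucedect-base-modernbert-en-v1"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)
model.eval()

context = "The Eiffel Tower is 330 metres tall and stands in Paris."
question = "How tall is the Eiffel Tower?"
answer = "The Eiffel Tower is 350 metres tall."

# Encode the (context, question, answer) triple as a single sequence.
inputs = tokenizer(
    f"{context}\n{question}\n{answer}",
    return_tensors="pt",
    truncation=True,
    max_length=8192,  # ModernBERT supports contexts up to 8k tokens
)

with torch.no_grad():
    logits = model(**inputs).logits   # shape: (1, seq_len, num_labels)
pred = logits.argmax(dim=-1)[0]       # assumed: 0 = supported, 1 = hallucinated

# Report tokens predicted as unsupported by the context.
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
print([tok for tok, lab in zip(tokens, pred) if lab.item() == 1])
```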

Community

Paper author · Paper submitter

We released π—Ÿπ—²π˜π˜π˜‚π—°π—²π——π—²π˜π—²π—°π˜, a lightweight hallucination detection framework for Retrieval-Augmented Generation (RAG) pipelines.

LettuceDetect addresses two critical challenges:

The π—°π—Όπ—»π˜π—²π˜…π˜-π˜„π—Άπ—»π—±π—Όπ˜„ π—Ήπ—Άπ—Ίπ—Άπ˜π˜€ in prior encoder-only models.
The 𝗡𝗢𝗴𝗡 π—°π—Όπ—Ίπ—½π˜‚π˜π—² π—°π—Όπ˜€π˜π˜€ associated with LLM-based detectors.

Built on π— π—Όπ—±π—²π—Ώπ—»π—•π—˜π—₯𝗧, our encoder-based model is released under the π— π—œπ—§ license and comes with ready-to-use Python packages and pretrained models.

