@singhsidhukuldeep on Hugging Face: "Groundbreaking Research Alert: Correctness ≠ Faithfulness in RAG Systems…"

singhsidhukuldeep

posted an update 26 days ago

Groundbreaking Research Alert: Correctness ≠ Faithfulness in RAG Systems

Fascinating new research from L3S Research Center, University of Amsterdam, and TU Delft examines how trustworthy citations are in Retrieval-Augmented Generation (RAG) systems. The study reports that up to 57% of citations in RAG systems could be unfaithful, even when they are technically correct.

>> Key Technical Insights:

Post-rationalization Problem
The researchers found that RAG systems often engage in "post-rationalization": the model first generates an answer from its parametric memory and only afterward looks for supporting evidence. As a result, a citation can be correct without reflecting the reasoning that actually produced the answer.
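One way to picture post-rationalization is a small probe: hand the system only documents that do not contain the answer and see whether it still cites one of them. The sketch below is illustrative only and is not the paper's protocol. The `RagFn` interface, `cites_without_evidence`, and the `toy_rag` stand-in are all hypothetical names introduced here.

```python
from typing import Callable, List, Tuple

# Hypothetical interface (an assumption, not from the paper): a RAG system takes
# a question and a list of (doc_id, text) pairs, and returns (answer, cited_ids).
RagFn = Callable[[str, List[Tuple[str, str]]], Tuple[str, List[str]]]


def cites_without_evidence(rag: RagFn, question: str, answer_terms: List[str],
                           distractors: List[Tuple[str, str]]) -> bool:
    """Return True if the system cites a document that cannot contain the answer."""
    # Guard: every distractor must be free of the answer terms, otherwise a
    # citation to it could be legitimate.
    for doc_id, text in distractors:
        lowered = text.lower()
        if any(term.lower() in lowered for term in answer_terms):
            raise ValueError(f"distractor {doc_id} contains the answer")
    _, cited = rag(question, distractors)
    distractor_ids = {doc_id for doc_id, _ in distractors}
    return any(doc_id in distractor_ids for doc_id in cited)


def toy_rag(question: str, docs: List[Tuple[str, str]]) -> Tuple[str, List[str]]:
    """Stand-in model: answers from 'memory' and cites the first document anyway."""
    memory = {"What is the capital of Germany?": "Berlin"}
    answer = memory.get(question, "unknown")
    cited = [docs[0][0]] if docs and answer != "unknown" else []
    return answer, cited


docs = [("fake-1", "Germany has sixteen federal states.")]
flagged = cites_without_evidence(
    toy_rag, "What is the capital of Germany?", ["Berlin"], docs
)
print(flagged)  # True: the toy model cites a document that never mentions Berlin
```

A real probe would replace `toy_rag` with a call to the system under test, and substring matching is only a crude check that a distractor is answer-free.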

Experimental Design
The team ran Command-R+ (104B parameters) with 4-bit quantization on an NVIDIA A100 GPU and evaluated on the NaturalQuestions dataset. They used BM25 for initial retrieval and ColBERT v2 for reranking.
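For readers unfamiliar with the first retrieval stage, here is a minimal pure-Python sketch of Okapi BM25 scoring. It assumes whitespace tokenization and common default values for `k1` and `b`; the function names and toy documents are made up for illustration, and the paper's actual retrieval setup may differ. The reranking stage (ColBERT v2 in the study) needs a trained model and is not shown; it would reorder the candidates returned here.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    # Simplifying assumption: lowercase whitespace tokenization.
    return text.lower().split()


def bm25_rank(query: str, docs: Dict[str, str], k1: float = 1.5,
              b: float = 0.75, top_k: int = 3) -> List[Tuple[str, float]]:
    """Score every document against the query with Okapi BM25, best first."""
    if not docs:
        return []
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized.values()) / n_docs or 1.0

    # Document frequency: in how many documents each term appears.
    doc_freq: Counter = Counter()
    for toks in tokenized.values():
        doc_freq.update(set(toks))

    query_terms = set(tokenize(query))
    scores = []
    for doc_id, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            length_norm = k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append((doc_id, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_k]


corpus = {
    "d1": "berlin is the capital of germany",
    "d2": "paris is the capital of france",
    "d3": "bananas are yellow",
}
ranked = bm25_rank("capital germany", corpus)
print([doc_id for doc_id, _ in ranked])  # ['d1', 'd2', 'd3']
```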

Attribution Framework
The research introduces a comprehensive framework for evaluating RAG systems across multiple dimensions:
- Citation Correctness: Whether cited documents support the claims
- Citation Faithfulness: Whether citations reflect actual model reasoning
- Citation Appropriateness: Relevance and meaningfulness of citations
- Citation Comprehensiveness: Coverage of key points
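The gap between the first two dimensions can be made concrete by treating correctness and faithfulness as two separate labels on each citation. The sketch below uses made-up labels and hypothetical names (`CitationJudgment`, `correct_but_unfaithful_rate`); it does not show how faithfulness would be determined, which is the hard part, and it is not the paper's metric.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CitationJudgment:
    supports_claim: bool  # correctness: the cited document backs the statement
    used_by_model: bool   # faithfulness: the document actually influenced the answer


def correct_but_unfaithful_rate(judgments: List[CitationJudgment]) -> float:
    """Share of correct citations that the model did not actually rely on."""
    correct = [j for j in judgments if j.supports_claim]
    if not correct:
        return 0.0
    return sum(not j.used_by_model for j in correct) / len(correct)


# Toy labels, invented for illustration only.
toy = [
    CitationJudgment(supports_claim=True, used_by_model=True),
    CitationJudgment(supports_claim=True, used_by_model=False),
    CitationJudgment(supports_claim=True, used_by_model=False),
    CitationJudgment(supports_claim=False, used_by_model=False),
]
rate = correct_but_unfaithful_rate(toy)  # two of the three correct citations
```

A correctness-only evaluation would count three of these four citations as good, while the second label shows that only one of them reflects what the model relied on.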

Under the Hood
The system's processing involves:
1. Document relevance prediction
2. Citation prediction
3. Answer generation without citations
4. Answer generation with citations

This work fundamentally challenges our understanding of RAG systems and highlights the need for more robust evaluation metrics in AI systems that claim to provide verifiable information.

csabakecskemeti

26 days ago

and there is the paper:
https://www.alphaxiv.org/abs/2412.18004

csabakecskemeti

26 days ago

seems it's happening:
ChatGPT
I provided context that has no information about whether Berlin is the capital of Germany, yet my 'fake' source was still cited.
