DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations Paper • 2410.18860 • Published Oct 24 • 9
Analysing the Residual Stream of Language Models Under Knowledge Conflicts Paper • 2410.16090 • Published Oct 21 • 7
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering Paper • 2410.15999 • Published Oct 21 • 19
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 Paper • 2408.05147 • Published Aug 9 • 38
Article Improving Hugging Face Training Efficiency Through Packing with Flash Attention Aug 21 • 25
🔍 Daily Picks in Interpretability & Analysis of LMs Collection Outstanding research in interpretability and evaluation of language models, summarized • 90 items • Updated 5 days ago • 93
Article Introducing RWKV — An RNN with the advantages of a transformer May 15, 2023 • 14
Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation Paper • 2406.13663 • Published Jun 19 • 7
A Simple and Effective L_2 Norm-Based Strategy for KV Cache Compression Paper • 2406.11430 • Published Jun 17 • 22
Article The Hallucinations Leaderboard, an Open Effort to Measure Hallucinations in Large Language Models Jan 29 • 17