DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations Paper • 2410.18860 • Published Oct 24, 2024 • 9
Analysing the Residual Stream of Language Models Under Knowledge Conflicts Paper • 2410.16090 • Published Oct 21, 2024 • 7
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering Paper • 2410.15999 • Published Oct 21, 2024 • 19
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 Paper • 2408.05147 • Published Aug 9, 2024 • 39
view article Article Improving Hugging Face Training Efficiency Through Packing with Flash Attention Aug 21, 2024 • 28
🔍 Interpretability & Analysis of LMs Collection Outstanding research in LM interpretability and evaluation, summarized • 101 items • Updated about 3 hours ago • 97
view article Article Introducing RWKV — An RNN with the advantages of a transformer May 15, 2023 • 14
Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation Paper • 2406.13663 • Published Jun 19, 2024 • 7
A Simple and Effective L_2 Norm-Based Strategy for KV Cache Compression Paper • 2406.11430 • Published Jun 17, 2024 • 23
view article Article The Hallucinations Leaderboard, an Open Effort to Measure Hallucinations in Large Language Models Jan 29, 2024 • 18