view article Article KV Caching Explained: Optimizing Transformer Inference Efficiency By not-lain โข 6 days ago โข 23
Open LLM Leaderboard best models โค๏ธโ๐ฅ Collection A daily uploaded list of models with best evaluations on the LLM leaderboard: โข 63 items โข Updated 1 day ago โข 523
Lost in the Middle: How Language Models Use Long Contexts Paper โข 2307.03172 โข Published Jul 6, 2023 โข 38
Cut Your Losses in Large-Vocabulary Language Models Paper โข 2411.09009 โข Published Nov 13, 2024 โข 44