LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published 5 days ago • 133
Article What is test-time compute and how to scale it? By Kseniase and 1 other • 19 days ago • 42
Article 🦸🏻#9: Does AI Remember? The Role of Memory in Agentic Workflows By Kseniase • 23 days ago • 14
Article 🦸🏻#8: Rewriting the Rules of Knowledge: How Modern Agents Learn to Adapt By Kseniase • 26 days ago • 5
Article Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference Jan 16 • 68
Tulu 3 Models Collection All models released with Tulu 3 -- state of the art open post-training recipes. • 11 items • Updated 13 days ago • 91
Article 🌁#85: Curiosity, Open Source, and Timing: The Formula Behind DeepSeek’s Phenomenal Success By Kseniase • 29 days ago • 6
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 8 items • Updated 2 days ago • 367