StructFlowBench: A Structured Flow Benchmark for Multi-turn Instruction Following Paper • 2502.14494 • Published 5 days ago • 13
Is Safety Standard Same for Everyone? User-Specific Safety Evaluation of Large Language Models Paper • 2502.15086 • Published 5 days ago • 14
PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data Paper • 2502.14397 • Published 5 days ago • 33
SIFT: Grounding LLM Reasoning in Contexts via Stickers Paper • 2502.14922 • Published 6 days ago • 28
EgoSpeak: Learning When to Speak for Egocentric Conversational Agents in the Wild Paper • 2502.14892 • Published 9 days ago • 3
CLIPPER: Compression enables long-context synthetic data generation Paper • 2502.14854 • Published 5 days ago • 7
REALTALK: A 21-Day Real-World Dataset for Long-Term Conversation Paper • 2502.13270 • Published 7 days ago • 6
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published 5 days ago • 91
Mixture of Tunable Experts - Behavior Modification of DeepSeek-R1 at Inference Time Article • By rbrt and 4 others • 7 days ago • 19
Small Models, Big Impact: Efficient Corpus and Graph-Based Adaptation of Small Multilingual Language Models for Low-Resource Languages Paper • 2502.10140 • Published 11 days ago • 9
Can a Single Model Master Both Multi-turn Conversations and Tool Use? CALM: A Unified Conversational Agentic Language Model Paper • 2502.08820 • Published 13 days ago • 4
Tiny-Agent-a Collection • Fast and powerful agentic models designed to run on edge devices. • 6 items • Updated 13 days ago • 7
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU Paper • 2502.08910 • Published 13 days ago • 141