SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs • Paper 2410.13276 • Published 24 days ago
T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge • Paper 2407.00088 • Published Jun 25
BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation • Paper 2402.10631 • Published Feb 16