FAST: Efficient Action Tokenization for Vision-Language-Action Models • Paper • 2501.09747 • Published 14 days ago • 23
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training • Paper • 2501.06842 • Published 18 days ago • 15
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token • Paper • 2501.03895 • Published 23 days ago • 48
Efficiently Serving LLM Reasoning Programs with Certaindex • Paper • 2412.20993 • Published Dec 30, 2024 • 35
iFormer: Integrating ConvNet and Transformer for Mobile Application • Paper • 2501.15369 • Published 5 days ago • 9
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models • Paper • 2501.12370 • Published 9 days ago • 8
Return of the Encoder: Maximizing Parameter Efficiency for SLMs • Paper • 2501.16273 • Published 3 days ago • 4