VideoRoPE: What Makes for Good Video Rotary Position Embedding? Paper • 2502.05173 • Published 13 days ago • 60
Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models Paper • 2502.04404 • Published 15 days ago • 19
QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation Paper • 2502.05178 • Published 13 days ago • 10