Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization Paper • 2412.18525 • Published 26 days ago • 70
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics Paper • 2501.04686 • Published 11 days ago • 48
Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens Paper • 2501.07730 • Published 5 days ago • 16
PokerBench: Training Large Language Models to become Professional Poker Players Paper • 2501.08328 • Published 5 days ago • 13