UniTok: A Unified Tokenizer for Visual Generation and Understanding Paper • 2502.20321 • Published 2 days ago • 16
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think Paper • 2502.20172 • Published 2 days ago • 11
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation Paper • 2502.20388 • Published 2 days ago • 7