Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity Paper • 2407.10387 • Published Jul 15 • 6
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models Paper • 2411.04996 • Published 2 days ago • 33