Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis Paper • 2412.15322 • Published Dec 19, 2024 • 18
BhasaAnuvaad Collection A Speech Translation Dataset for 13 Indian Languages • 11 items • Updated 11 days ago • 14
SongCreator: Lyrics-based Universal Song Generation Paper • 2409.06029 • Published Sep 9, 2024 • 22