Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages Paper • 2410.16153 • Published 23 days ago • 42
AutoTrain: No-code training for state-of-the-art models Paper • 2410.15735 • Published 24 days ago • 56
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio Paper • 2410.12787 • Published 28 days ago • 30
LEOPARD : A Vision Language Model For Text-Rich Multi-Image Tasks Paper • 2410.01744 • Published Oct 2 • 25
UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models Paper • 2410.14059 • Published 27 days ago • 52
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models Paper • 2409.17146 • Published Sep 25 • 100
Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination Paper • 2411.03823 • Published 8 days ago • 41