MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks Paper • 2410.10563 • Published Oct 14, 2024 • 38
Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation Paper • 2501.03225 • Published 19 days ago • 7
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains Paper • 2501.05707 • Published 16 days ago • 19