Can Community Notes Replace Professional Fact-Checkers? Paper • 2502.14132 • Published 18 days ago • 5
PaliGemma 2: A Family of Versatile VLMs for Transfer Paper • 2412.03555 • Published Dec 4, 2024 • 129
FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture Paper • 2406.11030 • Published Jun 16, 2024
Understanding Retrieval Robustness for Retrieval-Augmented Image Captioning Paper • 2406.02265 • Published Jun 4, 2024 • 7
Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings Paper • 2404.16820 • Published Apr 25, 2024 • 17
ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models Paper • 2311.07022 • Published Nov 13, 2023 • 1
Text Rendering Strategies for Pixel Language Models Paper • 2311.00522 • Published Nov 1, 2023 • 12
Measuring Progress in Fine-grained Vision-and-Language Understanding Paper • 2305.07558 • Published May 12, 2023 • 1