Visual Riddles: a Commonsense and World Knowledge Challenge for Large Vision and Language Models Paper • 2407.19474 • Published Jul 28, 2024 • 23
Make It Count: Text-to-Image Generation with an Accurate Number of Objects Paper • 2406.10210 • Published Jun 14, 2024 • 77