FastVLM: Efficient Vision Encoding for Vision Language Models Paper • 2412.13303 • Published 26 days ago • 13
Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions Paper • 2407.06723 • Published Jul 9, 2024 • 11