Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning Paper • 2412.03565 • Published Dec 4, 2024
SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation Paper • 2311.14671 • Published Nov 24, 2023
DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs Paper • 2406.04334 • Published Jun 6, 2024
To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning Paper • 2311.07574 • Published Nov 13, 2023