MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training • Paper • 2403.09611 • Published Mar 14, 2024 • 126
Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset • Paper • 2403.09029 • Published Mar 14, 2024 • 55
GiT: Towards Generalist Vision Transformer through Universal Language Interface • Paper • 2403.09394 • Published Mar 14, 2024 • 26