lbourdois/caption-maya-multimodal-pretrain-clean
Viewer
•
Updated
•
551k
•
165
Datasets I cleaned with an image, a prompt question (like "describe this image") and an answer. Can be used to train VLMs.