AIMv2 Collection A collection of AIMv2 vision encoders that supports a number of resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated Nov 22, 2024 • 71
LLM2CLIP Collection LLM2CLIP makes SOTA pretrained CLIP modal more SOTA ever. • 10 items • Updated 29 days ago • 55
BRAVE: Broadening the visual encoding of vision-language models Paper • 2404.07204 • Published Apr 10, 2024 • 19
view article Article Llama can now see and run on your device - welcome Llama 3.2 Sep 25, 2024 • 182
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 566
NIM Serverless Inference API Collection Models in this collection are available for inference via a serverless API powered by NVIDIA NIM. • 8 items • Updated 21 days ago • 23
Qwen2-VL Collection Vision-language model series based on Qwen2 • 16 items • Updated Dec 6, 2024 • 202
InternVL2.0 Collection Expanding Performance Boundaries of Open-Source MLLM • 15 items • Updated 28 days ago • 89