Article ColFlor: Towards BERT-Size Vision-Language Document Retrieval Models By ahmed-masry • Oct 18, 2024 • 16
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18, 2024 • 225
Video Language Models Collection A collection of video-language models • 5 items • Updated Aug 1, 2024 • 2
Vision Language Leaderboards Collection This collection has all the vision-language leaderboards. • 7 items • Updated Aug 24, 2024 • 15
Depth Anything v2 Release Collection A comprehensive collection on DAv2 • 5 items • Updated Jun 18, 2024 • 11
Vision Language Models Papers 🖼️💬📝 Collection Papers about vision-language models; the most important ones are at the top of the list. • 27 items • Updated Apr 30, 2024 • 35
Awesome Document AI Collection A collection of open-source document AI 📄 📝 📈 • 27 items • Updated Mar 11, 2024 • 76
OWL-series 🦉 Collection Models and applications of OWL-ViT and OWLv2. • 13 items • Updated Mar 11, 2024 • 6
plant-image-datasets Collection Image datasets about the kingdom Plantae. • 4 items • Updated Feb 29, 2024 • 2
Zero-shot Image Classification Models 🖼️ Collection This is a collection for models that can be used for zero-shot image classification. • 10 items • Updated Sep 19, 2023 • 2