Model Card for LLaVa-Phi-2-3B

Model Details

Model Description

  • Developed by: LAION, SkunkworksAI & Ontocord
  • Model type: LLaVA is an open-source chatbot trained by fine-tuning Phi-2 on GPT-generated multimodal instruction-following data. It is an auto-regressive language model based on the transformer architecture.
  • Finetuned from model: Phi-2
  • License: MIT
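
To illustrate what "auto-regressive" means in the model type above, here is a minimal sketch of greedy decoding: each new token is chosen from the tokens produced so far and then appended to the context. The `greedy_decode` helper, the `BIGRAMS` table, and `toy_next_token` are hypothetical stand-ins invented for this example. They are not part of this model's code, and a real model would score a full vocabulary with a transformer rather than look up a table.

```python
from typing import Callable, List


def greedy_decode(
    next_token: Callable[[List[str]], str],
    prompt: List[str],
    max_new_tokens: int,
    eos: str = "<eos>",
) -> List[str]:
    """Generate tokens one at a time, feeding each output back as input."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tok = next_token(tokens)  # prediction conditioned on everything so far
        tokens.append(tok)
        if tok == eos:  # stop once the end-of-sequence token appears
            break
    return tokens


# Hypothetical stand-in for a language model: a fixed bigram lookup table.
BIGRAMS = {"a": "cat", "cat": "sits", "sits": "<eos>"}


def toy_next_token(tokens: List[str]) -> str:
    """Pick the next token from the last token only (toy assumption)."""
    return BIGRAMS.get(tokens[-1], "<eos>")


result = greedy_decode(toy_next_token, ["a"], max_new_tokens=10)
print(result)
```

In a multimodal model of this kind, the prompt would also carry image features alongside the text tokens, but the token-by-token loop has the same shape.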

Model Sources

Evaluation

Benchmarks

Scores in %; "-" means not reported.

Model                  Parameters  SQA    GQA    TextVQA  POPE
LLaVA-1.5              7.3B        68.0   62.0   58.3     85.3
MC-LLaVA-3B            3B          -      49.6   38.59    -
LLaVA-Phi              3B          68.4   -      48.6     85.0
moondream1             1.6B        -      56.3   39.8     -
llava-phi-2-3b         2.7B        69.0   51.2   47.0     86.0
llava-phi-2-3b-siglip  2.7B        70.15  52.56  47.99    87.00
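
The last two rows of the table can be compared directly. The sketch below copies their scores and computes the per-benchmark difference between the two variants; the `SCORES` dictionary and the `score_deltas` helper are names made up for this illustration, and the numbers are taken from the table as reported, not re-measured.

```python
# Scores copied from the last two rows of the benchmark table (in %).
SCORES = {
    "llava-phi-2-3b": {"SQA": 69.0, "GQA": 51.2, "TextVQA": 47.0, "POPE": 86.0},
    "llava-phi-2-3b-siglip": {"SQA": 70.15, "GQA": 52.56, "TextVQA": 47.99, "POPE": 87.00},
}


def score_deltas(base: dict, variant: dict) -> dict:
    """Return variant minus base for every benchmark, rounded to 2 decimals."""
    return {name: round(variant[name] - base[name], 2) for name in base}


deltas = score_deltas(SCORES["llava-phi-2-3b"], SCORES["llava-phi-2-3b-siglip"])
for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f} points")
```

On these reported numbers, the SigLIP variant is ahead by roughly one point or more on all four benchmarks.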

  • Model size: 2.79B params (Safetensors, tensor type F32)
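
The listed parameter count and tensor type give a rough idea of how much memory the weights alone occupy. The following is a back-of-the-envelope sketch, assuming the rounded figure of 2.79B parameters and 4 bytes per F32 value; `weight_memory_gb` is a hypothetical helper, and the estimate ignores activations, caches, and framework overhead.

```python
# Bytes per element for common tensor types (F32 is the type listed above).
BYTES_PER_PARAM = {"F32": 4, "F16": 2}

NUM_PARAMS = 2.79e9  # rounded parameter count as listed; exact count may differ


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


f32_gb = weight_memory_gb(NUM_PARAMS, "F32")
f16_gb = weight_memory_gb(NUM_PARAMS, "F16")
print(f"F32 weights: ~{f32_gb:.2f} GB")
print(f"F16 weights: ~{f16_gb:.2f} GB (if cast to half precision)")
```

Under these assumptions the F32 checkpoint comes to about 11.16 GB, and casting to half precision would halve that.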
