datasets:
  - liuhaotian/LLaVA-Pretrain
  - liuhaotian/LLaVA-Instruct-150K
pipeline_tag: image-text-to-text
library_name: xtuner


Model

llava-internlm2-7b is a LLaVA model fine-tuned from InternLM2-Chat-7B and CLIP-ViT-Large-patch14-336 with LLaVA-Pretrain and LLaVA-Instruct by XTuner.

Results

| Model | MMBench Test (EN) | MMBench Dev (EN) | MMBench Test (CN) | MMBench Dev (CN) | CCBench Dev | MME | SEEDBench_IMG | MMVet | MMMU Dev | MathVista MiniTest | HallusionBench aAcc |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LLaVA-v1.5-7B (XTuner) | 67.7 | 69.2 | 61.0 | 59.7 | 28.4 | 1716 | 66.4 | 32.2 | 33.7 | 24.2 | 46.2 |
| LLaVA-v1.5-13B (XTuner) | 68.8 | 69.5 | 64.7 | 63.1 | 32.9 | 1766 | 67.9 | 35.9 | 35.2 | 26.2 | 46.9 |
| LLaVA-InternLM-7B (XTuner) | 69.0 | 68.5 | 66.7 | 63.8 | 37.3 | 1637 | 65.7 | 32.4 | 36.9 | 26.3 | 49.1 |
| LLaVA-InternLM2-7B | 73.3 | 74.6 | 71.7 | 72.0 | 42.5 | 1700 | 71.2 | 35.9 | 40.1 | 25.5 | 46.8 |
| LLaVA-InternLM2-20B | 75.1 | 73.5 | 73.7 | 72.8 | 46.3 | 1868 | 70.2 | 37.2 | 39.4 | 24.6 | 47.7 |

Quickstart

Installation

pip install -U 'xtuner[deepspeed]'

Chat

xtuner chat internlm/internlm2-chat-7b \
  --visual-encoder openai/clip-vit-large-patch14-336 \
  --llava xtuner/llava-internlm2-7b \
  --prompt-template internlm2_chat \
  --image $IMAGE_PATH
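
The chat command expects $IMAGE_PATH to point to an existing image file. As a minimal sketch, a small wrapper can check the path first and print the command to run. `build_chat_cmd` is a hypothetical helper name, not part of XTuner; the flags are copied unchanged from the command above.

```shell
# Hypothetical helper (not part of XTuner): validate the image path,
# then print the chat command shown above with that path filled in.
build_chat_cmd() {
  image_path="$1"
  if [ ! -f "$image_path" ]; then
    echo "image not found: $image_path" >&2
    return 1
  fi
  # The path is wrapped in single quotes so that spaces survive;
  # paths that themselves contain single quotes are not handled here.
  printf '%s\n' "xtuner chat internlm/internlm2-chat-7b --visual-encoder openai/clip-vit-large-patch14-336 --llava xtuner/llava-internlm2-7b --prompt-template internlm2_chat --image '$image_path'"
}
```

Once the printed command looks right, it can be pasted into the shell or run with `eval "$(build_chat_cmd "$IMAGE_PATH")"`.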

Training

  1. Alignment module pretraining (saved by default in ./work_dirs/)
NPROC_PER_NODE=8 xtuner train llava_internlm2_chat_7b_clip_vit_large_p14_336_e1_gpu8_pretrain --deepspeed deepspeed_zero2
  2. Instruction-following fine-tuning (saved by default in ./work_dirs/)
NPROC_PER_NODE=8 xtuner train llava_internlm2_chat_7b_qlora_clip_vit_large_p14_336_lora_e1_gpu8_finetune --deepspeed deepspeed_zero2
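
The two training stages are meant to run in order, pretraining first and fine-tuning second. A minimal sketch of a script that chains them and stops if the first stage fails is given below. The `run` and `train_two_stages` helpers and the `DRY_RUN` variable are assumptions introduced for illustration; only the `xtuner train` commands themselves come from the steps above.

```shell
PRETRAIN_CFG=llava_internlm2_chat_7b_clip_vit_large_p14_336_e1_gpu8_pretrain
FINETUNE_CFG=llava_internlm2_chat_7b_qlora_clip_vit_large_p14_336_lora_e1_gpu8_finetune

# Hypothetical helper: with DRY_RUN=1 (the default here) print the command,
# otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Run pretraining, then fine-tuning only if pretraining succeeded (&&).
train_two_stages() {
  run env NPROC_PER_NODE=8 xtuner train "$PRETRAIN_CFG" --deepspeed deepspeed_zero2 &&
  run env NPROC_PER_NODE=8 xtuner train "$FINETUNE_CFG" --deepspeed deepspeed_zero2
}
```

Calling `train_two_stages` with the default `DRY_RUN=1` only prints the two commands; setting `DRY_RUN=0` before the call would launch the actual training runs on 8 GPUs.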

MMBench Evaluation

XTuner integrates MMBench evaluation, which you can run with the following command:

xtuner mmbench internlm/internlm2-chat-7b \
  --visual-encoder openai/clip-vit-large-patch14-336 \
  --llava xtuner/llava-internlm2-7b \
  --prompt-template internlm2_chat \
  --data-path $MMBENCH_DATA_PATH \
  --work-dir $RESULT_PATH

When the evaluation finishes, results for a development set are printed directly. For a test set, submit mmbench_result.xlsx to the official MMBench evaluation to obtain the final accuracy results.
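
The dev/test distinction above can be sketched as two small shell helpers: one guesses the split from the data file name, the other locates the spreadsheet to submit. Both function names are hypothetical, and the assumption that data files contain `DEV` or `TEST` in their names (for example `MMBench_DEV_EN.tsv`) should be checked against the files you actually downloaded. The search is recursive because the exact location of mmbench_result.xlsx inside the work directory is not stated here.

```shell
# Hypothetical helper: infer the split from the file name passed as --data-path.
# Assumes names contain DEV or TEST; anything else is reported as unknown.
mmbench_split() {
  case "$(basename "$1")" in
    *DEV*)  echo "dev" ;;
    *TEST*) echo "test" ;;
    *)      echo "unknown"; return 1 ;;
  esac
}

# Hypothetical helper: list result spreadsheets anywhere under the directory
# that was passed as --work-dir.
find_mmbench_result() {
  find "$1" -type f -name 'mmbench_result.xlsx'
}
```

For a test split, `find_mmbench_result "$RESULT_PATH"` prints the path of the file to upload.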

Citation

@misc{2023xtuner,
    title={XTuner: A Toolkit for Efficiently Fine-tuning LLM},
    author={XTuner Contributors},
    howpublished = {\url{https://github.com/InternLM/xtuner}},
    year={2023}
}