OpenGVLab/InternVL-Chat-V1-5

czczup committed on Apr 20, 2024

Commit 1f98fd5 · verified · 1 Parent(s): 8c16a21

Update README.md

Files changed (1)

README.md +1 -1

@@ -17,7 +17,7 @@ pipeline_tag: visual-question-answering
 ## Model Details
 - **Model Type:** vision large language model, multimodal chatbot
 - **Model Stats:**
-  - Architecture: [InternViT-6B-448px-V1-5](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-2) + MLP + [InternLM2-Chat-20B](https://huggingface.co/internlm/internlm2-chat-20b)
+  - Architecture: [InternViT-6B-448px-V1-5](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-5) + MLP + [InternLM2-Chat-20B](https://huggingface.co/internlm/internlm2-chat-20b)
   - Params: 25.5B
   - Image size: dynamic resolution, max to 40 tiles of 448 x 448 during inference.
   - Number of visual tokens: 256 * (number of tiles + 1)
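
The visual-token formula in the Model Stats list can be checked with a small calculation. The sketch below is illustrative only: the function name `num_visual_tokens` and the constants are hypothetical and are not part of the model's code. It assumes the stated formula, 256 * (number of tiles + 1), and the stated inference limit of 40 tiles.

```python
# Hypothetical helper, not part of the InternVL codebase.
# It applies the formula from the README: 256 * (number of tiles + 1).

TOKENS_PER_TILE = 256  # per the README formula
MAX_TILES = 40         # per the README: at most 40 tiles of 448 x 448 at inference


def num_visual_tokens(num_tiles: int) -> int:
    """Return the visual token count for an image split into `num_tiles` tiles."""
    if not 1 <= num_tiles <= MAX_TILES:
        raise ValueError(f"num_tiles must be between 1 and {MAX_TILES}, got {num_tiles}")
    return TOKENS_PER_TILE * (num_tiles + 1)


if __name__ == "__main__":
    for tiles in (1, 12, 40):
        print(tiles, num_visual_tokens(tiles))
```

At the 40-tile maximum this gives 256 * 41 = 10496 visual tokens, which is worth keeping in mind when budgeting context length for high-resolution inputs.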