czczup committed on
Commit d980280
1 Parent(s): 61d9457

Update README.md

Files changed (1)
  1. README.md +3 -6
README.md CHANGED
@@ -17,22 +17,19 @@ InternVL scales up the ViT to _**6B parameters**_ and aligns it with LLM.
 
  It is _**the largest open-source vision/vision-language foundation model (14B)**_ to date, achieving _**32 state-of-the-art**_ performances on a wide range of tasks such as visual perception, cross-modal retrieval, multimodal dialogue, etc.
 
- # Model Zoo
-
- ## Pretrained Weights
+ # Pretrained Weights
 
  | model name | type | download | size |
  | ----------------------- | ------- | ---------------------------------------------------------------------------------------------- | :-----: |
  | InternViT-6B-224px | pytorch | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL/blob/main/intern_vit_6b_224px.pth) | 12 GB |
 
- ## Linear-Probe Image Classification
+ # Linear-Probe Image Classification
 
  | model name | IN-1K | IN-ReaL | IN-V2 | IN-A | IN-R | IN-Sketch | download |
  | ------------------ | :---: | :-----: | :---: | :--: | :--: | :-------: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
  | InternViT-6B-224px | 88.2 | 90.4 | 79.9 | 77.5 | 89.8 | 69.1 | [ckpt](https://huggingface.co/OpenGVLab/InternVL/resolve/main/intern_vit_6b_224px_head.pth) \| [log](https://github.com/OpenGVLab/InternVL/blob/main/classification/work_dirs/intern_vit_6b_1k_224/log_rank0.txt) |
 
-
- ## Semantic Segmentation
+ # Semantic Segmentation
 
  | type | backbone | head | mIoU | config | download |
  | --------------- | --------------------- | :-----: | :--: | :--------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |