Update README.md
README.md

---
license: apache-2.0
language:
- en
---

# Model Card for ViM

<!-- Provide a quick summary of what the model is/does. -->

[Beyond Language: Multi-layer Transformer is a General Visual Learner](https://arxiv.org/abs/2222.33333)

This repository provides the ViM checkpoints, training logs, and the pre-trained files used in the project.

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

In this project, we introduce ViM (Large Visual Modeling). ViM has the following characteristics:

- 😮 **Minimalist architecture design similar to LLMs**: ViM consists solely of a single transformer, without any additional vision encoder or adapter (a toy sketch of this design follows the list).
- 🚀 **Covering all types of visual understanding tasks**: ViM addresses a spectrum of visual tasks, including object-level tasks (e.g., object detection), pixel-level tasks (e.g., semantic segmentation), and vision-language tasks (e.g., image captioning).
- 🤗 **Achieving task synergy through a unified language interface**: Similar to LLMs, ViM exhibits a task-synergy effect in multi-task training.
- 🔥 **SOTA performance on zero-shot and few-shot benchmarks**: ViM scales well with model size and data, demonstrating remarkable generalizability across diverse scenarios after being trained on 27 datasets.
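
Purely to illustrate the idea (this is not the ViM implementation, and every name below is hypothetical), a single transformer can serve detection, segmentation, and captioning by turning each task into next-token prediction over one shared vocabulary:

```python
# Toy sketch of a unified language interface -- NOT the official ViM code.
# Image patches and prompt tokens are embedded into one sequence, processed by a
# single shared transformer, and every task is answered as next-token prediction.
import torch
import torch.nn as nn


class UnifiedVisionLM(nn.Module):  # hypothetical class, for illustration only
    def __init__(self, vocab_size=32000, dim=512, depth=6, heads=8, patch=16):
        super().__init__()
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)  # image -> patch tokens
        self.token_embed = nn.Embedding(vocab_size, dim)                       # prompt/answer tokens
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, depth)                      # the single transformer
        self.head = nn.Linear(dim, vocab_size)                                 # one language-style head

    def forward(self, image, prompt_ids):
        img = self.patch_embed(image).flatten(2).transpose(1, 2)  # (B, num_patches, dim)
        txt = self.token_embed(prompt_ids)                        # (B, prompt_len, dim)
        x = self.blocks(torch.cat([img, txt], dim=1))             # one joint sequence, no adapter
        return self.head(x[:, -1])                                # logits for the next output token


model = UnifiedVisionLM()
image = torch.randn(1, 3, 224, 224)
prompt_ids = torch.randint(0, 32000, (1, 8))  # e.g. a tokenized "detect ..." or "caption ..." prompt
print(model(image, prompt_ids).shape)         # torch.Size([1, 32000])
```

Changing only the text prompt switches the task; boxes, masks, and captions would all be decoded from the same token stream.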

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6585493b53c37507639fe3ba/FEhRT9ZscNwG7xIYYIYmh.png)

- **Developed by:** Haiyang Wang ( [email protected] ), Hao Tang ( [email protected] )
- **License:** Apache License 2.0

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/Haiyang-W/ViM
- **Paper:** https://arxiv.org/abs/2222.33333

## Uses

Please refer to the [ViM GitHub repository](https://github.com/Haiyang-W/ViM) for detailed usage instructions.
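
To fetch the checkpoint and log files hosted here, one option is `huggingface_hub`; a small sketch under assumptions follows (the `repo_id` and the file patterns below are placeholders, not the actual values for this repository):

```python
# Sketch: download the files in this model repository with huggingface_hub.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="your-namespace/ViM",       # placeholder -- replace with this repo's actual id
    allow_patterns=["*.pth", "*.log"],  # assumed extensions for checkpoints and training logs
)
print("Downloaded to:", local_dir)
```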