Update README.md
README.md CHANGED
@@ -18,7 +18,7 @@ In this project, we introduce GiT (Generalist Vision Transformer). GiT has the f
 - 🔮 **Minimalist architecture design similar to LLM**: GiT consists solely of a single transformer, without any additional vision encoder or adapter.
 - 🚀 **Covering all types of visual understanding tasks**: GiT addresses a spectrum of visual tasks, including object-level tasks (e.g., object detection), pixel-level tasks (e.g., semantic segmentation), and vision-language tasks (e.g., image captioning).
 - 🤗 **Achieving task synergy by unified language interface**: Similar to LLMs, GiT observes a task synergy effect in multi-task training.
-- 🔥 **
+- 🔥 **Strong performance on zero-shot and few-shot benchmarks**: GiT scales well with model size and data, demonstrating remarkable generalizability across diverse scenarios after training on 27 datasets.
 
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6585493b53c37507639fe3ba/0-qINMmUF8ugjb2jdsHLa.png)