Update README.md
README.md
CHANGED
@@ -16,7 +16,7 @@ Try out our [web demo 🚀](http://imagebind-llm.opengvlab.com/) here!
 
 We present SPHINX, a versatile multi-modal large language model (MLLM) with a mixer of training tasks, data domains, and visual embeddings.
 
-- **Task Mix.** For all-purpose capabilities, we mix a variety of vision-language tasks for mutual improvement: VQA, REC, REG, OCR, etc.
+- **Task Mix.** For all-purpose capabilities, we mix a variety of vision-language tasks for mutual improvement: VQA, REC, REG, OCR, DET, POSE, REL DET, T2I, etc.
 
 - **Embedding Mix.** We capture robust visual representations by fusing distinct visual architectures, pre-training, and granularity.
 