Alpha-VLLM/SPHINX

void0721 committed on Nov 3, 2023

Commit f03b15c

1 Parent(s): 64a1f4a

Update README.md

Files changed (1)

README.md CHANGED

@@ -16,7 +16,7 @@ Try out our [web demo 🚀](http://imagebind-llm.opengvlab.com/) here!
 We present SPHINX, a versatile multi-modal large language model (MLLM) with a mixer of training tasks, data domains, and visual embeddings.
-- **Task Mix.** For all-purpose capabilities, we mix a variety of vision-language tasks for mutual improvement: VQA, REC, REG, OCR, etc.
+- **Task Mix.** For all-purpose capabilities, we mix a variety of vision-language tasks for mutual improvement: VQA, REC, REG, OCR, DET, POSE, REL DET, T2I, etc.
 - **Embedding Mix.** We capture robust visual representations by fusing distinct visual architectures, pre-training, and granularity.