Update README.md
README.md
CHANGED
@@ -31,19 +31,16 @@ widget:
|
|
31 |
# EraX-VL-2B-V1.5
|
32 |
## Introduction 🎉
|
33 |
|
34 |
-
|
35 |
|
36 |
One standout feature of **EraX-VL-2B-V1.5** is its ability to handle multi-turn Q&A with reasonable reasoning at a small size of just over 2 billion parameters.
|
37 |
|
38 |
-
***NOTA BENE***:
|
|
|
|
|
39 |
|
40 |
**EraX-VL-2B-V1.5** is a young and tiny member of our **EraX's LànhGPT** collection of LLMs.
|
41 |
|
42 |
-
- **Developed by:**
|
43 |
-
- Nguyễn Anh Nguyên ([email protected])
|
44 |
-
- Nguyễn Hồ Nam (BCG)
|
45 |
-
- Phạm Đình Thục ([email protected])
|
46 |
-
- **Funded by:** [Bamboo Capital Group](https://bamboocap.com.vn) and EraX
|
47 |
- **Model type:** Multimodal Transformer with over 2B parameters
|
48 |
- **Languages (NLP):** Primarily Vietnamese with multilingual capabilities
|
49 |
- **License:** Apache 2.0
|
|
|
31 |
# EraX-VL-2B-V1.5
|
32 |
## Introduction 🎉
|
33 |
|
34 |
+
After the warm welcome of the **<a href="https://huggingface.co/erax-ai/EraX-VL-7B-V1.0" target="_blank">EraX-VL-7B-V1.0 model</a>**, we are excited to introduce **EraX-VL-2B-V1.5**, a robust multimodal model for **OCR (optical character recognition)** and **VQA (visual question-answering)** that excels in various languages 🌍, with a particular focus on **Vietnamese 🇻🇳**. The `EraX-VL-2B` model stands out for its precise recognition across a range of documents 📝, including medical forms 🩺, invoices 🧾, bills of sale 💳, quotes 📄, and medical records 💊. This functionality is expected to be highly beneficial for hospitals 🏥, clinics 💉, insurance companies 🛡️, and similar organizations 📋. `EraX-VL-2B` is fine-tuned from [Qwen/Qwen2-VL-2B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct)[1], which we found to be of high quality and fluent in Vietnamese. We plan to keep improving the model and releasing new versions for free, and to share performance benchmarks in the near future.
|
35 |
|
36 |
One standout feature of **EraX-VL-2B-V1.5** is its ability to handle multi-turn Q&A with reasonable reasoning at a small size of just over 2 billion parameters.
|
37 |
|
38 |
+
***NOTA BENE***:
|
39 |
+
- EraX-VL-2B-V1.5 is NOT a typical OCR-only tool like Tesseract; it is a multimodal LLM-based model. To use it effectively, you may have to **tweak your prompt carefully** depending on your task.
|
40 |
+
- This model has NOT (yet) been fine-tuned on medical (X-ray) or car-accident datasets. Stay tuned for an updated version coming sometime in 2025.
|
41 |
|
42 |
**EraX-VL-2B-V1.5** is a young and tiny member of our **EraX's LànhGPT** collection of LLMs.
|
43 |
|
|
|
|
|
|
|
|
|
|
|
44 |
- **Model type:** Multimodal Transformer with over 2B parameters
|
45 |
- **Languages (NLP):** Primarily Vietnamese with multilingual capabilities
|
46 |
- **License:** Apache 2.0
|