erax-ai/EraX-VL-2B-V1.5

@@ -31,13 +31,13 @@ widget:
 # EraX-VL-2B-V1.5
 ## Introduction 🎉
-We are excited to introduce **EraX-VL-2B-V1.5**, a robust multimodal model for **OCR (optical character recognition)** and **VQA (visual question-answering)** that excels in various languages 🌍, with a particular focus on Vietnamese 🇻🇳. The `EraX-VL-2B` model stands out for its precise recognition capabilities across a range of documents 📝, including medical forms 🩺, invoices 🧾, bills of sale 💳, quotes 📄, and medical records 💊. This functionality is expected to be highly beneficial for hospitals 🏥, clinics 💉, insurance companies 🛡️, and other similar applications 📋. Built on the solid foundation of the [Qwen/Qwen2-VL-2B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct)[1], which we found to be of high quality and fluent in Vietnamese, `EraX-VL-2B` has been fine-tuned to enhance its performance. We plan to continue improving and releasing new versions for free, along with sharing performance benchmarks in the near future.
-One standing-out feature of **EraX-VL-2B-V1.5** is the capability to do multi-turn Q&A with reasonable reasoning capability!
 ***NOTA BENE***: EraX-VL-2B-V1.5 is NOT a typical OCR-only tool like Tesseract but is a Multimodal LLM-based model. To use it effectively, you may have to **tweak your prompt carefully** depending on your tasks.
-**EraX-VL-2B-V1.5** is a young member of our **EraX's LànhGPT** collection of LLM models.
 - **Model type:** Multimodal Transformer with over 2B parameters
 - **Languages (NLP):** Primarily Vietnamese with multilingual capabilities
@@ -56,7 +56,7 @@ One standing-out feature of **EraX-VL-2B-V1.5** is the capability to do multi-tu
     </tr>
     <tr>
         <th align="middle">EraX-VL-7B-V1.5 🥇 </th>
-        <td align="middle">✘</td>
         <td align="middle">47.2 </td>
     </tr>
     <tr>

 # EraX-VL-2B-V1.5
 ## Introduction 🎉
+We are excited to introduce **EraX-VL-2B-V1.5**, a robust multimodal model for **OCR (optical character recognition)** and **VQA (visual question-answering)** that excels in various languages 🌍, with a particular focus on **Vietnamese 🇻🇳**. The `EraX-VL-2B` model stands out for its precise recognition capabilities across a range of documents 📝, including medical forms 🩺, invoices 🧾, bills of sale 💳, quotes 📄, and medical records 💊. This functionality is expected to be highly beneficial for hospitals 🏥, clinics 💉, insurance companies 🛡️, and other similar applications 📋. Built on the solid foundation of the [Qwen/Qwen2-VL-2B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct)[1], which we found to be of high quality and fluent in Vietnamese, `EraX-VL-2B` has been fine-tuned to enhance its performance. We plan to continue improving and releasing new versions for free, along with sharing performance benchmarks in the near future.
+One standout feature of **EraX-VL-2B-V1.5** is its ability to handle multi-turn Q&A with reasonable reasoning capability despite its small size of just over 2 billion parameters.
 ***NOTA BENE***: EraX-VL-2B-V1.5 is NOT a typical OCR-only tool like Tesseract but is a Multimodal LLM-based model. To use it effectively, you may have to **tweak your prompt carefully** depending on your tasks.
+**EraX-VL-2B-V1.5** is a young and tiny member of our **EraX's LànhGPT** collection of LLMs.
 - **Model type:** Multimodal Transformer with over 2B parameters
 - **Languages (NLP):** Primarily Vietnamese with multilingual capabilities
     </tr>
     <tr>
         <th align="middle">EraX-VL-7B-V1.5 🥇 </th>
+        <td align="middle">(soon)</td>
         <td align="middle">47.2 </td>
     </tr>
     <tr>