+---
+license: apache-2.0
+language:
+- en
+library_name: transformers
+tags:
+- multimodal
+- aria
+---
+<p align="center">
+  <br>Aria<br>
+</p>
+<p align="center">
+🔗 <a href="https://huggingface.co" target="_blank">Try Aria!</a> · 📖 <a href="https://huggingface.co" target="_blank">Blog</a> · 📌 <a href="https://huggingface.co" target="_blank">Paper</a>
+· 🖤 <a href="https://huggingface.co" target="_blank">GitHub</a> · 💜 <a href="https://huggingface.co" target="_blank">Discord</a>
+· 💙 <a href="https://huggingface.co" target="_blank">Twitter</a>
+</p>
+# Highlights
+- Aria is the **first open multimodal-native MoE** model, handling multiple input modalities within a single MoE architecture.
+- Aria performs **on par with GPT-4o mini and Gemini 1.5 Flash** across a range of multimodal tasks while maintaining strong performance on **text**-only tasks.
+- Compared to similar or even larger models, Aria offers **faster speeds** and **lower costs**. This efficiency comes from activating only 3.9B of its 25.3B parameters during inference, the **fewest** among models with comparable performance.
+# Key features
+- **Robust multimodal understanding**: Aria processes various input modalities, including video, images, code, and text. It performs strongly across diverse downstream tasks such as long-context video understanding, image understanding, and OCR. It also excels at instruction following.
+- **Flexible image handling**: Aria supports variable image sizes and aspect ratios while maintaining high quality.
+- **Extended context capacity**: Aria can handle multiple images within a long context window of 64K tokens.
+- **Advanced text understanding**: Aria demonstrates competitive performance across language and coding tasks.
+# Model Info
+| Model | Download | Parameters | Context Length |
+| :---- | :------- | :--------- | :------------- |
+| Aria | < HF link - TBD> | • Activated: 3.9B (3.5B MoE + 0.4B Visual Encoder) <br> • Total: 25.3B | 64K |
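
The parameter figures in the table can be checked with a few lines of arithmetic. The sketch below is only an illustration: the constant and function names are hypothetical, and the numbers are copied from the table above rather than read from the model itself.

```python
# Figures copied from the Model Info table, in billions of parameters.
# The names below are illustrative and not part of any Aria API.
MOE_ACTIVE_B = 3.5       # MoE parameters activated during inference
VISION_ENCODER_B = 0.4   # visual encoder parameters
TOTAL_B = 25.3           # total parameters


def active_params_b() -> float:
    """Parameters activated during inference, in billions."""
    return MOE_ACTIVE_B + VISION_ENCODER_B


def active_fraction() -> float:
    """Share of the total parameter count that is active during inference."""
    return active_params_b() / TOTAL_B


if __name__ == "__main__":
    print(f"activated: {active_params_b():.1f}B")
    print(f"share of total: {active_fraction():.1%}")
```

With these figures, roughly 15% of the total parameters are active during inference, which is the basis of the efficiency claim in the Highlights.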
+# Benchmark
+# Quick Start
+# License
+This repo is released under the Apache 2.0 License.