---
license: other
license_name: nvclv1
license_link: LICENSE
datasets:
- ILSVRC/imagenet-1k
pipeline_tag: image-classification
---

[**MambaVision: A Hybrid Mamba-Transformer Vision Backbone**](https://arxiv.org/abs/2407.08083)

### Model Overview

We introduce a novel mixer block that adds a symmetric path without SSM to enhance the modeling of global context. MambaVision has a hierarchical architecture that employs both self-attention blocks and the proposed mixer blocks.

### Model Performance

MambaVision demonstrates strong performance, achieving a new SOTA Pareto front in terms of Top-1 accuracy versus throughput.

### Model Usage

You must first log in to Hugging Face to pull the model:

```Bash
huggingface-cli login
```

The model can then be loaded as follows (an end-to-end inference sketch is included at the end of this card):

```Python
from transformers import AutoModelForImageClassification

model = AutoModelForImageClassification.from_pretrained("nvidia/MambaVision-T-1K", trust_remote_code=True)
```

### License: [NVIDIA Source Code License-NC](https://huggingface.co/nvidia/MambaVision-T-1K/blob/main/LICENSE)
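For a quick end-to-end check, below is a minimal inference sketch. It is illustrative only: the 224x224 input size, the ImageNet normalization statistics, the positional forward call, and the `.logits` output attribute are assumptions not specified by this card, so verify them against the model's remote code and configuration.

```Python
# Minimal inference sketch (assumptions noted inline; not the card's official recipe).
import torch
from PIL import Image
from torchvision import transforms
from transformers import AutoModelForImageClassification

model = AutoModelForImageClassification.from_pretrained(
    "nvidia/MambaVision-T-1K", trust_remote_code=True
)
model.eval()

# Standard ImageNet-1K preprocessing (assumed; check the model's config for
# the actual input resolution and normalization statistics).
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# "example.jpg" is a hypothetical placeholder path; substitute any RGB image.
image = Image.open("example.jpg").convert("RGB")
inputs = preprocess(image).unsqueeze(0)  # add batch dimension: (1, 3, 224, 224)

with torch.no_grad():
    # The exact forward signature depends on the remote code; a positional
    # tensor argument is assumed here.
    outputs = model(inputs)

# Assumes the remote code returns either an output object with a .logits
# attribute (the transformers convention) or a raw logits tensor.
logits = outputs.logits if hasattr(outputs, "logits") else outputs
predicted_class = logits.argmax(dim=-1).item()
print(f"Predicted ImageNet-1K class index: {predicted_class}")
```

The fallback on the second-to-last line covers both return conventions; if the remote code returns a plain tensor rather than a `ModelOutput`, the sketch still works unchanged.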