# Llamba Models

The Llamba models are part of Cartesia's [Edge](https://github.com/cartesia-ai/edge) library, designed for efficient, high-performance machine learning applications. For more details, refer to the [paper](#).

---

## Usage

### Llamba on PyTorch

To use Llamba with PyTorch:

1. Install the required package:

   ```bash
   pip install --no-binary :all: cartesia-pytorch
   ```

2. Load and run the model:

   ```python
   from transformers import AutoTokenizer

   from cartesia_pytorch.Llamba.llamba import LlambaLMHeadModel

   # Load the Llamba weights and the matching Llama 3.2 tokenizer.
   model = LlambaLMHeadModel.from_pretrained("AvivBick/Llamba-1B", strict=True).to("cuda")
   tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")

   # Tokenize the prompt and move it to the GPU.
   input_ids = tokenizer("Hello, my name is", return_tensors="pt").input_ids
   input_ids = input_ids.to("cuda")

   # Generate up to 100 tokens and decode the result.
   output = model.generate(input_ids, max_length=100)[0]
   print(tokenizer.decode(output, skip_special_tokens=True))
   ```

### Llamba on MLX

To run Llamba on Apple silicon with the Metal framework:

_(Add specific instructions here when available.)_

---

## Evaluations

Details on model performance, benchmarks, and evaluation metrics can be found in the [paper link](#).

_(Expand on this section if specific results or datasets are available.)_
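
For intuition, the `generate` call above runs an autoregressive decoding loop: score the next token, append the chosen token, repeat until `max_length`. The toy sketch below illustrates the greedy variant of that loop with a stand-in scoring function (`toy_logits` is hypothetical, not part of Llamba or Transformers); the real model instead produces logits over the full Llama 3 vocabulary on the GPU.

```python
def toy_logits(token_ids):
    """Stand-in for a language model's forward pass: returns a score per
    vocabulary entry, here simply favoring (last_token + 1) mod vocab_size."""
    vocab_size = 8
    last = token_ids[-1]
    return [1.0 if t == (last + 1) % vocab_size else 0.0 for t in range(vocab_size)]

def greedy_generate(prompt_ids, max_length):
    """Greedy decoding: repeatedly append the argmax token until max_length."""
    ids = list(prompt_ids)
    while len(ids) < max_length:
        scores = toy_logits(ids)
        ids.append(max(range(len(scores)), key=scores.__getitem__))  # argmax
    return ids

print(greedy_generate([0], 5))  # [0, 1, 2, 3, 4]
```

`model.generate` supports sampling strategies beyond greedy decoding, but the control flow is the same: the prompt tokens are consumed once, then one token is emitted per step.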