# Llamba Models

The Llamba models are part of Cartesia's [Edge](https://github.com/cartesia-ai/edge) library, designed for efficient, high-performance machine learning applications.
For more details, refer to the [paper](#).

---
## Usage

### Llamba on PyTorch

To use Llamba with PyTorch:

1. Install the required package:

```bash
pip install --no-binary :all: cartesia-pytorch
```
2. Load and run the model:

```python
from transformers import AutoTokenizer
from cartesia_pytorch.Llamba.llamba import LlambaLMHeadModel

model = LlambaLMHeadModel.from_pretrained("AvivBick/Llamba-1B", strict=True).to('cuda')
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
input_ids = tokenizer("Hello, my name is", return_tensors="pt").input_ids
input_ids = input_ids.to('cuda')
output = model.generate(input_ids, max_length=100)[0]
print(tokenizer.decode(output, skip_special_tokens=True))
```
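The call to `generate(input_ids, max_length=100)` above decodes autoregressively: the model predicts one token, appends it to the sequence, and repeats until the sequence reaches `max_length`. A minimal sketch of that greedy loop, with a toy `toy_next_token` function standing in for the real model's forward pass (hypothetical, for illustration only):

```python
def toy_next_token(ids):
    # Stand-in for a real forward pass: deterministically maps the
    # current sequence to a "next token" id.
    return (sum(ids) % 7) + 1

def greedy_generate(input_ids, max_length):
    # Append one predicted token at a time until max_length is reached.
    ids = list(input_ids)
    while len(ids) < max_length:
        ids.append(toy_next_token(ids))
    return ids

print(greedy_generate([1, 2, 3], 6))
```

The real `generate` adds stopping criteria (e.g. end-of-sequence tokens) and sampling strategies on top of this basic loop.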

### Llamba on MLX

To run Llamba with the Metal framework:

_(Add specific instructions here when available.)_

---

### Evaluations

Details on model performance, benchmarks, and evaluation metrics can be found in the [paper](#).

_(Expand on this section if specific results or datasets are available.)_
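Language-model evaluations of this kind commonly report perplexity. As background, a self-contained sketch of computing perplexity from per-token log-probabilities (illustrative only, not the paper's evaluation protocol):

```python
import math

def perplexity(token_logprobs):
    # Perplexity is the exponential of the negative mean
    # per-token log-probability.
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# A model that assigns probability 0.25 to every token in a
# sequence has perplexity 4.
print(perplexity([math.log(0.25)] * 4))
```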