
# Llamba Models

The Llamba models are part of Cartesia's Edge library, designed for efficient, high-performance machine learning applications. Llamba-1B is a recurrent (Mamba-based) language model distilled from Llama-3.2-1B, which is why it reuses the Llama-3.2 tokenizer.

For more details, refer to the paper.


## Usage

### Llamba on PyTorch

To use Llamba with PyTorch:

1. Install the required package:

   ```shell
   pip install --no-binary :all: cartesia-pytorch
   ```

2. Load and run the model:

   ```python
   from transformers import AutoTokenizer
   from cartesia_pytorch.Llamba.llamba import LlambaLMHeadModel

   # Load the Llamba weights; the model uses the Llama-3.2 tokenizer.
   model = LlambaLMHeadModel.from_pretrained("AvivBick/Llamba-1B", strict=True).to("cuda")
   tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")

   input_ids = tokenizer("Hello, my name is", return_tensors="pt").input_ids.to("cuda")
   output = model.generate(input_ids, max_length=100)[0]
   print(tokenizer.decode(output, skip_special_tokens=True))
   ```
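The `generate` call above covers most use cases. If you need token-by-token control (custom stopping rules, logging per-step logits, etc.), greedy decoding can be sketched as below. Note that `TinyLM` is a hypothetical stand-in so the snippet runs without downloading the checkpoint; any model whose forward pass returns logits of shape `(batch, seq_len, vocab_size)`, including `LlambaLMHeadModel`, plugs into the same loop.

```python
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    """Hypothetical stand-in for LlambaLMHeadModel: returns per-position logits."""
    def __init__(self, vocab_size=32, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, input_ids):
        # (batch, seq_len) -> (batch, seq_len, vocab_size)
        return self.head(self.embed(input_ids))

@torch.no_grad()
def greedy_generate(model, input_ids, max_length=20):
    # Repeatedly append the highest-probability next token until max_length.
    while input_ids.shape[1] < max_length:
        logits = model(input_ids)                              # (batch, seq, vocab)
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        input_ids = torch.cat([input_ids, next_token], dim=1)
    return input_ids

torch.manual_seed(0)
model = TinyLM()
prompt = torch.randint(0, 32, (1, 5))
out = greedy_generate(model, prompt, max_length=12)
print(out.shape)  # torch.Size([1, 12])
```

This is the loop `generate` implements internally (plus features like sampling and early stopping on EOS); swapping the argmax for a multinomial draw over softmaxed logits gives sampled decoding.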

### Llamba on MLX

To run Llamba with the Metal framework:
(Add specific instructions here when available.)


## Evaluations

Details on model performance, benchmarks, and evaluation metrics can be found in the paper.
(Expand on this section if specific results or datasets are available.)