
Llamba Models

The Llamba models are part of Cartesia's Edge library, designed for efficient, high-performance machine learning applications.

For more details, refer to the paper.


Usage

Llamba on PyTorch

To use Llamba with PyTorch:

1. Install the required package:

```shell
pip install --no-binary :all: cartesia-pytorch
```

2. Load and run the model:

```python
from transformers import AutoTokenizer
from cartesia_pytorch.Llamba.llamba import LlambaLMHeadModel

# Load the Llamba-3B weights and the Llama-3.2 tokenizer the model shares.
model = LlambaLMHeadModel.from_pretrained("AvivBick/Llamba-3B", strict=True).to("cuda")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B")

# Tokenize a prompt, move it to the GPU, and generate a completion.
input_ids = tokenizer("Hello, my name is", return_tensors="pt").input_ids.to("cuda")
output = model.generate(input_ids, max_length=100)[0]
print(tokenizer.decode(output, skip_special_tokens=True))
```
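The `generate` call above performs autoregressive decoding: at each step the tokens produced so far are fed back into the model and the chosen next token is appended, until an end-of-sequence token or the length limit is reached. A minimal, self-contained sketch of that loop, with a toy `next_token` function standing in for the real model (all names here are illustrative, not part of the `cartesia_pytorch` API):

```python
def next_token(ids):
    # Toy "model": deterministically emits the next integer, then EOS (0).
    # A real model would return the argmax over its output logits here.
    return ids[-1] + 1 if ids[-1] < 5 else 0

def greedy_generate(input_ids, max_length=10, eos_id=0):
    # Autoregressive loop: append one token at a time until EOS or max_length.
    ids = list(input_ids)
    while len(ids) < max_length:
        tok = next_token(ids)
        ids.append(tok)
        if tok == eos_id:
            break
    return ids

print(greedy_generate([1, 2], max_length=8))  # -> [1, 2, 3, 4, 5, 0]
```

The real `model.generate` adds sampling strategies, caching, and batching on top of this loop, but the control flow is the same.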

Llamba on MLX

To run Llamba with MLX on Apple silicon:
(Add specific instructions here when available.)


Evaluations

Details on model performance, benchmarks, and evaluation metrics can be found in the paper.
(Expand on this section if specific results or datasets are available.)

Model size: 3.66B parameters (F32, Safetensors)