
Llamba Models

The Llamba models are part of Cartesia's Edge library, designed for efficient, high-performance machine learning applications.

For more details, refer to the paper.


Usage

Llamba on PyTorch

To use Llamba with PyTorch:

  1. Install the required package:

     ```shell
     pip install --no-binary :all: cartesia-pytorch
     ```

  2. Load and run the model:
```python
from transformers import AutoTokenizer
from cartesia_pytorch.Llamba.llamba import LlambaLMHeadModel

# Load the model and the Llama-3.2 tokenizer it shares
model = LlambaLMHeadModel.from_pretrained("AvivBick/Llamba-1B", strict=True).to("cuda")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")

# Tokenize a prompt, move it to the GPU, and generate a completion
input_ids = tokenizer("Hello, my name is", return_tensors="pt").input_ids.to("cuda")
output = model.generate(input_ids, max_length=100)[0]
print(tokenizer.decode(output, skip_special_tokens=True))
```
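
For context, `model.generate` with default settings performs greedy decoding: at each step it picks the highest-scoring next token and appends it until a length limit or an end-of-sequence token is reached. A minimal sketch of that loop, with a toy `next_token_logits` function standing in for the real model (hypothetical, for illustration only):

```python
def greedy_generate(next_token_logits, input_ids, max_length, eos_id=None):
    """Greedy decoding: repeatedly append the argmax token.

    next_token_logits: callable mapping the token ids so far to a list of
    scores, one per vocabulary entry (stand-in for a model forward pass).
    """
    ids = list(input_ids)
    while len(ids) < max_length:
        logits = next_token_logits(ids)            # score every candidate token
        next_id = max(range(len(logits)), key=logits.__getitem__)
        ids.append(next_id)                        # commit the best one
        if eos_id is not None and next_id == eos_id:
            break                                  # stop early at end-of-sequence
    return ids

# Toy "model" over a 5-token vocabulary: always prefers (last_id + 1) % 5
toy = lambda ids: [1.0 if t == (ids[-1] + 1) % 5 else 0.0 for t in range(5)]
print(greedy_generate(toy, [0], max_length=5))  # → [0, 1, 2, 3, 4]
```

Sampling strategies (temperature, top-k, top-p) replace the argmax with a draw from the adjusted distribution; the surrounding loop is the same.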

Llamba on MLX

Instructions for running Llamba with MLX on Apple's Metal framework are not yet available.


Evaluations

Details on model performance, benchmarks, and evaluation metrics can be found in the paper.
