---
license: apache-2.0
language:
- en
tags:
- mamba
- mlx
- cartesia
---
|
|
|
# Model Card for mamba2-2.7b-4bit-mlx
|
|
|
This is an [MLX](https://ml-explore.github.io/mlx)-compatible version of the [mamba2-2.7b](https://huggingface.co/state-spaces/mamba2-2.7b) model, quantized to 4 bits. It uses the [EleutherAI/gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b) tokenizer.
|
For more details, see our [blog post](https://cartesia.ai/blog/on-device).
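The generation example below accepts a raw string prompt, so you do not need to tokenize inputs yourself. If you simply want to inspect how the referenced [EleutherAI/gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b) tokenizer splits a prompt, here is a minimal sketch (it assumes the `transformers` package is installed; it is not required for the MLX usage below):

```python
from transformers import AutoTokenizer

# Load the tokenizer referenced by this model card (assumes `transformers` is installed).
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
print(tokenizer.encode("Rene Descartes was"))  # token IDs for the sample prompt used below
```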
|
|
|
## Usage

### Installation

This model requires the `cartesia-metal` and `cartesia-mlx` packages.
|
|
|
Installation requires Xcode, which can be downloaded from https://developer.apple.com/xcode/. Accept the license agreement with:
|
```shell
sudo xcodebuild -license
```
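If `xcodebuild` is not found, the Xcode command line tools are likely missing. The following standard macOS commands (not specific to this package) check whether they are installed and install them if needed:

```shell
xcode-select -p || xcode-select --install
```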
|
|
|
Install the dependencies in order: the pinned commit of `nanobind`, then `cartesia-metal`, and finally `cartesia-mlx`:
|
```shell
pip install nanobind@git+https://github.com/wjakob/nanobind.git@2f04eac452a6d9142dedb957701bdb20125561e4
pip install git+https://github.com/cartesia-ai/edge.git#subdirectory=cartesia-metal
pip install cartesia-mlx
```
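As a quick sanity check (a minimal sketch that only uses the imports appearing in the generation example below), confirm that the packages import cleanly:

```shell
python -c "import mlx.core, cartesia_mlx; print('ok')"
```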
|
|
|
Note: This package has been tested on macOS Sonoma 14.1 with the M3 chip.
|
|
|
### Generation example
|
```python
import mlx.core as mx
import cartesia_mlx as cmx

# Load the 4-bit quantized model from the Hugging Face Hub.
model = cmx.from_pretrained("cartesia-ai/mamba2-2.7b-4bit-mlx")
model.set_dtype(mx.float32)

prompt = "Rene Descartes was"

# Stream generated text to stdout as it is produced.
print(prompt, end="", flush=True)
for text in model.generate(
    prompt,
    max_tokens=500,
    eval_every_n=5,
    verbose=True,
    top_p=0.99,
    temperature=0.85,
):
    print(text, end="", flush=True)
```
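If you would rather collect the output into a single string instead of streaming it, the same `generate` call can be accumulated. This is a minimal sketch that reuses only the parameters shown above; the remaining parameters are assumed to keep their defaults:

```python
# Accumulate the streamed chunks rather than printing them incrementally.
chunks = []
for text in model.generate(prompt, max_tokens=100, top_p=0.99, temperature=0.85):
    chunks.append(text)
completion = prompt + "".join(chunks)
print(completion)
```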
|
|
|
## About Cartesia

At [Cartesia](https://cartesia.ai/), we're building real-time multimodal intelligence for every device.