xLSTM goes 7B
This xLSTM was pre-trained on DCLM and selected high-quality data, for a total of approximately 2.3T tokens, using the xlstm-jax framework.
How to use it
First, install xlstm, which now uses the mlstm_kernels package for its Triton kernels:
pip install xlstm
pip install mlstm_kernels
For now, install the transformers repository fork from NX-AI (until it is merged):
pip install 'transformers @ git+ssh://git@github.com/NX-AI/transformers.git@integrate_xlstm'
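Before loading the model, it can help to confirm that the packages installed above are visible to the current Python environment. The following is a minimal sketch, assuming the import names match the pip package names (`xlstm`, `mlstm_kernels`, `transformers`); the helper `missing_packages` is hypothetical and not part of any of these libraries.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found as importable modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: each import name is the same as its pip package name.
    missing = missing_packages(["xlstm", "mlstm_kernels", "transformers"])
    print("missing:", missing if missing else "none")
```

This only checks that the modules can be located; it does not verify that the installed transformers build is the NX-AI fork.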
Use this model as:
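from transformers import AutoModelForCausalLM, AutoTokenizer
xlstm = AutoModelForCausalLM.from_pretrained("NX-AI/xLSTM-7b", device_map="auto")
# this is a fork of EleutherAI/gpt
tokenizer = AutoTokenizer.from_pretrained("NX-AI/xLSTM-7b")
# tokenize to PyTorch tensors on the model's device, then run a forward pass
input_ids = tokenizer("Hello xLSTM, how are you doing?", return_tensors="pt")["input_ids"].to(xlstm.device)
outputs = xlstm(input_ids)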
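A forward call yields logits rather than text. To obtain a text continuation, one option is the standard `generate` method of transformers models. The sketch below assumes the xLSTM model class supports `generate` like other causal language models in transformers; the wrapper name `generate_text` and its defaults are illustrative, not part of the library.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=32):
    """Greedy-decode from `prompt` and return the prompt plus its continuation."""
    # Tokenize to PyTorch tensors and move the token IDs to the model's device.
    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(model.device)
    # do_sample=False selects greedy decoding; max_new_tokens bounds the output length.
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens, do_sample=False)
    # generate() returns a batch of sequences; decode the first (and only) one.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With the model and tokenizer loaded above, a call might look like `print(generate_text(xlstm, tokenizer, "Hello xLSTM, how are you doing?"))`.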
Speed results
Generation speed was measured using torch.cuda.graph and torch.compile optimizations.
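For readers who want to time generation on their own hardware, throughput can be estimated with a simple wall-clock helper. This is a rough sketch under stated assumptions, not the benchmark setup used for this model: `tokens_per_second` and `generate_fn` are hypothetical names, and `generate_fn` is assumed to be a zero-argument callable that produces exactly `n_tokens` new tokens per call.

```python
import time


def tokens_per_second(generate_fn, n_tokens, warmup=1, repeats=3):
    """Time `generate_fn()` and return the best observed tokens per second."""
    for _ in range(warmup):
        generate_fn()  # warm-up runs absorb one-time costs such as compilation
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        generate_fn()
        best = min(best, time.perf_counter() - start)
    # Guard against a zero elapsed time on very coarse clocks.
    return n_tokens / max(best, 1e-9)
```

On a GPU, `generate_fn` should call `torch.cuda.synchronize()` before returning, since CUDA kernels are launched asynchronously and the timer would otherwise stop too early.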
Performance
Using HuggingFace's lm_eval:
| BBH | MMLU-Pro | Math | MUSR | GPQA | IfEval |
|---|---|---|---|---|---|
| 0.381 | 0.242 | 0.036 | 0.379 | 0.280 | 0.244 |
Using HuggingFace's lighteval in the Leaderboard-v1 settings:
| Arc-Challenge (25-shot) | MMLU (5-shot) | Hellaswag (10-shot) | Winogrande (5-shot) | TruthfulQA (0-shot) | GSM8k (5-shot) | OpenbookQA (5-shot) | PiQA (5-shot) |
|---|---|---|---|---|---|---|---|
| 0.584 | 0.589 | 0.710 | 0.742 | 0.420 | 0.004 | 0.443 | 0.817 |
License
NXAI Community License (see LICENSE file)