FluentlyLM Prinum (32B-version)

Introducing the first standalone model from Project Fluently LM! We worked on it for several months, tried several approaches, and settled on the one that worked best.

Model Details

Model Description

Developed by: @fluently-lm
Model type: Causal Language Models (QwenForCausalLM, LM Transformer)
Number of Parameters: 32.5B
Number of Parameters (Non-Embedding): 31.0B
Number of Layers: 64
Number of Attention Heads (GQA): 40 for Q and 8 for KV
Context Length: Full 131,072 tokens
Language(s) (NLP): English, French, Spanish, Russian, Chinese, Japanese, Persian (official support)
License: MIT
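
The architecture figures above allow a rough memory estimate. The sketch below is only back-of-the-envelope arithmetic: the parameter count, layer count, KV-head count, and context length come from the list above, while the head dimension of 128 and 2 bytes per value (16-bit weights and cache) are assumptions not stated in this card.

```python
# Figures taken from the model description.
NUM_PARAMS = 32.5e9
NUM_LAYERS = 64
NUM_KV_HEADS = 8
CONTEXT_LENGTH = 131_072

# Assumptions (not stated in the model card).
HEAD_DIM = 128        # assumed per-head dimension
BYTES_PER_VALUE = 2   # assumed 16-bit weights and KV cache

GIB = 1024 ** 3


def weight_bytes(num_params=NUM_PARAMS, bytes_per_value=BYTES_PER_VALUE):
    """Approximate memory needed to hold the weights."""
    return num_params * bytes_per_value


def kv_cache_bytes(num_tokens, num_layers=NUM_LAYERS, num_kv_heads=NUM_KV_HEADS,
                   head_dim=HEAD_DIM, bytes_per_value=BYTES_PER_VALUE):
    """Approximate KV-cache size for one sequence of num_tokens tokens."""
    # The factor 2 accounts for one key and one value tensor per layer.
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value * num_tokens


print(f"weights:           {weight_bytes() / GIB:.1f} GiB")
print(f"KV cache per token: {kv_cache_bytes(1) / 1024:.0f} KiB")
print(f"KV cache at full context: {kv_cache_bytes(CONTEXT_LENGTH) / GIB:.0f} GiB")
```

Under these assumptions, grouped-query attention with 8 KV heads instead of 40 keeps the cache five times smaller than it would be with one KV head per query head.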

Quickstart

The following code snippet uses apply_chat_template to show how to load the tokenizer and model and how to generate content.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "fluently-lm/FluentlyLM-Prinum"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Write a quick sort algorithm."
messages = [
    {"role": "system", "content": "You are FluentlyLM, created by Project Fluently. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024
)
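# generate() returns the prompt tokens followed by the new tokens,
# so drop the prompt portion from each sequence before decoding.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)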
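
The list comprehension near the end of the snippet is easy to misread, so here is a toy illustration with plain Python lists instead of tensors. The token ids are made up for the example; the only point is that each output sequence starts with its prompt, and slicing by the prompt length leaves the newly generated tokens.

```python
# Made-up token ids standing in for model_inputs.input_ids and the generate() output.
prompt_ids = [[101, 7, 8, 9]]
output_ids = [[101, 7, 8, 9, 42, 43, 2]]  # prompt tokens + newly generated tokens

# Same slicing pattern as in the quickstart: skip len(prompt) tokens per sequence.
new_tokens = [out[len(inp):] for inp, out in zip(prompt_ids, output_ids)]

print(new_tokens)  # [[42, 43, 2]]
```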

Using GGUF

You can also run the model locally from a GGUF file in various interfaces and workflows. Several repos offer GGUF downloads:

mradermacher/FluentlyLM-Prinum-GGUF (all GGUF-quants)
fluently-lm/FluentlyLM-Prinum-Q4_K_M-GGUF (only Q4_K_M-quant) (coming soon...)
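
One possible way to run a GGUF quant is the llama-cpp-python package. This card does not name a specific runtime, so treat the following as a sketch rather than the recommended setup. The `*Q4_K_M.gguf` filename pattern is an assumption about how files in the repo are named, the context size of 4096 is an arbitrary choice, and `load_prinum_gguf` and `ask` are hypothetical helper names.

```python
DEFAULT_REPO = "mradermacher/FluentlyLM-Prinum-GGUF"
DEFAULT_PATTERN = "*Q4_K_M.gguf"  # assumed file naming; check the repo's file list


def load_prinum_gguf(repo_id=DEFAULT_REPO, filename=DEFAULT_PATTERN, n_ctx=4096):
    """Download (if needed) and load a GGUF quant. Hypothetical helper."""
    # Imported lazily so this module loads even without llama-cpp-python installed.
    from llama_cpp import Llama

    return Llama.from_pretrained(repo_id=repo_id, filename=filename, n_ctx=n_ctx)


def ask(llm, prompt, max_tokens=256):
    """Send one user message and return the reply text. Hypothetical helper."""
    messages = [
        {"role": "system", "content": "You are FluentlyLM, created by Project Fluently. You are a helpful assistant."},
        {"role": "user", "content": prompt},
    ]
    result = llm.create_chat_completion(messages=messages, max_tokens=max_tokens)
    return result["choices"][0]["message"]["content"]


# Usage (downloads a multi-gigabyte file on first run):
# llm = load_prinum_gguf()
# print(ask(llm, "Write a quick sort algorithm."))
```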

Model recipe

Evolution

🏆 12th place on Open LLM Leaderboard (21.02.2025)

Special thanks

🤗 We are grateful for open source resources, technologies and assistance from: Unsloth AI, Axolotl AI, Argilla, Alibaba Cloud: Qwen, NVIDIA and NousResearch.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	47.22
IFEval (0-Shot)	80.90
BBH (3-Shot)	59.48
MATH Lvl 5 (4-Shot)	54.00
GPQA (0-shot)	18.23
MuSR (0-shot)	17.26
MMLU-PRO (5-shot)	53.42
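
The reported average appears consistent with an unweighted mean of the six benchmark scores in the table. The check below uses `decimal` to avoid floating-point rounding noise; treating the average as a plain mean is an assumption, not something this card states.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scores copied from the table above.
scores = {
    "IFEval (0-Shot)": Decimal("80.90"),
    "BBH (3-Shot)": Decimal("59.48"),
    "MATH Lvl 5 (4-Shot)": Decimal("54.00"),
    "GPQA (0-shot)": Decimal("18.23"),
    "MuSR (0-shot)": Decimal("17.26"),
    "MMLU-PRO (5-shot)": Decimal("53.42"),
}

# Unweighted mean, rounded half-up to two decimal places.
mean = sum(scores.values()) / len(scores)
rounded = mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(rounded)  # 47.22
```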
