---
base_model: LGAI-EXAONE/EXAONE-3.5-32B-Instruct
base_model_relation: finetune
license: other
license_name: exaone
license_link: LICENSE
language:
- en
- ko
tags:
- lg-ai
- exaone
- exaone-deep
pipeline_tag: text-generation
library_name: transformers
---
# EXAONE-Deep-32B
## Introduction
We introduce EXAONE Deep, a family of models ranging from 2.4B to 32B parameters developed and released by LG AI Research, which exhibits superior capabilities in various reasoning tasks including math and coding benchmarks. Evaluation results show that 1) EXAONE Deep **2.4B** outperforms other models of comparable size, 2) EXAONE Deep **7.8B** outperforms not only open-weight models of comparable scale but also the proprietary reasoning model OpenAI o1-mini, and 3) EXAONE Deep **32B** demonstrates competitive performance against leading open-weight models.
For more details, please refer to our [documentation](https://arxiv.org/abs/2503.12524), [blog](https://www.lgresearch.ai/news/view?seq=543) and [GitHub](https://github.com/LG-AI-EXAONE/EXAONE-Deep).
This repository contains the 32B reasoning language model, which has the following features (a short config-inspection sketch follows the list):
- Number of Parameters (without embeddings): 30.95B
- Number of Layers: 64
- Number of Attention Heads: GQA with 40 Q-heads and 8 KV-heads
- Vocab Size: 102,400
- Context Length: 32,768 tokens
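These figures can be cross-checked against the model's published configuration. The sketch below assumes the attribute names follow common `transformers` conventions (`num_hidden_layers`, `num_attention_heads`, `num_key_value_heads`, `vocab_size`, `max_position_embeddings`); EXAONE's custom config class may name some of them differently.
```python
from transformers import AutoConfig

# Load only the configuration (nothing beyond the config file is downloaded).
# trust_remote_code is required because EXAONE ships custom model classes.
config = AutoConfig.from_pretrained(
    "LGAI-EXAONE/EXAONE-Deep-32B",
    trust_remote_code=True,
)

# Attribute names below are assumed from common transformers conventions.
print("layers:        ", config.num_hidden_layers)        # expected 64
print("query heads:   ", config.num_attention_heads)      # expected 40
print("kv heads (GQA):", config.num_key_value_heads)      # expected 8
print("vocab size:    ", config.vocab_size)               # expected 102,400
print("context length:", config.max_position_embeddings)  # expected 32,768
```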
## Quickstart
We recommend using `transformers` v4.43.1 or later (for example, `pip install "transformers>=4.43.1"`).
Here is a code snippet for running conversational inference with the model:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer
from threading import Thread

model_name = "LGAI-EXAONE/EXAONE-Deep-32B"
streaming = True    # choose the streaming option

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Choose your prompt (the second assignment below overwrites the first,
# so comment out the one you do not want):
#   Math example (AIME 2024)
prompt = r"""Let $x,y$ and $z$ be positive real numbers that satisfy the following system of equations:
\[\log_2\left({x \over yz}\right) = {1 \over 2}\]\[\log_2\left({y \over xz}\right) = {1 \over 3}\]\[\log_2\left({z \over xy}\right) = {1 \over 4}\]
Then the value of $\left|\log_2(x^4y^3z^2)\right|$ is $\tfrac{m}{n}$ where $m$ and $n$ are relatively prime positive integers. Find $m+n$.

Please reason step by step, and put your final answer within \boxed{}."""
#   Korean MCQA example (CSAT Math 2025). In English: given a sequence {a_n} with
#   a_1 = 2 and an arithmetic sequence {b_n} with b_1 = 2 satisfying
#   sum_{k=1}^{n} a_k / b_{k+1} = n^2 / 2 for all natural numbers n,
#   find the value of sum_{k=1}^{5} a_k.
prompt = r"""Question : $a_1 = 2$인 수열 $\{a_n\}$과 $b_1 = 2$인 등차수열 $\{b_n\}$이 모든 자연수 $n$에 대하여\[\sum_{k=1}^{n} \frac{a_k}{b_{k+1}} = \frac{1}{2} n^2\]을 만족시킬 때, $\sum_{k=1}^{5} a_k$의 값을 구하여라.
Options :
A) 120
B) 125
C) 130
D) 135
E) 140

Please reason step by step, and you should write the correct option alphabet (A, B, C, D or E) within \boxed{}."""

messages = [
    {"role": "user", "content": prompt}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
)

if streaming:
    # Stream tokens as they are generated; generate() must run in a
    # background thread so the main thread can consume the streamer.
    streamer = TextIteratorStreamer(tokenizer)
    thread = Thread(target=model.generate, kwargs=dict(
        input_ids=input_ids.to(model.device),
        eos_token_id=tokenizer.eos_token_id,
        max_new_tokens=32768,
        do_sample=True,
        temperature=0.6,
        top_p=0.95,
        streamer=streamer
    ))
    thread.start()

    for text in streamer:
        print(text, end="", flush=True)
else:
    output = model.generate(
        input_ids.to(model.device),
        eos_token_id=tokenizer.eos_token_id,
        max_new_tokens=32768,
        do_sample=True,
        temperature=0.6,
        top_p=0.95,
    )
    print(tokenizer.decode(output[0]))
```
> ### Note
> The EXAONE Deep models are trained with an optimized configuration,
> so we recommend following the [Usage Guideline](#usage-guideline) section to achieve optimal performance.
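Both prompts above ask the model to put its final answer inside `\boxed{}`, so a small helper can recover that answer from the decoded output. This is a hypothetical convenience function, not part of the model card's API; its regex assumes no nested braces inside the box, which holds for numeric or single-letter answers like those in the examples.
```python
import re
from typing import Optional

def extract_boxed_answer(text: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in `text`, or None if absent.

    Hypothetical helper; assumes no nested braces inside the box, which is
    fine for numeric or single-letter answers like those in the prompts above.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1] if matches else None

# With the non-streaming path above:
# print(extract_boxed_answer(tokenizer.decode(output[0])))
```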
## Evaluation
The following table shows evaluation results on reasoning tasks in math and coding; all scores are accuracies in %. The full evaluation results can be found in the [documentation](https://arxiv.org/abs/2503.12524).
Models | MATH-500 (pass@1) | AIME 2024 (pass@1 / cons@64) | AIME 2025 (pass@1 / cons@64) | CSAT Math 2025 (pass@1) | GPQA Diamond (pass@1) | LiveCodeBench (pass@1) |
---|---|---|---|---|---|---|
EXAONE Deep 32B | 95.7 | 72.1 / 90.0 | 65.8 / 80.0 | 94.5 | 66.1 | 59.5 |
DeepSeek-R1-Distill-Qwen-32B | 94.3 | 72.6 / 83.3 | 55.2 / 73.3 | 84.1 | 62.1 | 57.2 |
QwQ-32B | 95.5 | 79.5 / 86.7 | 67.1 / 76.7 | 94.4 | 63.3 | 63.4 |
DeepSeek-R1-Distill-Llama-70B | 94.5 | 70.0 / 86.7 | 53.9 / 66.7 | 88.8 | 65.2 | 57.5 |
DeepSeek-R1 (671B) | 97.3 | 79.8 / 86.7 | 66.8 / 80.0 | 89.9 | 71.5 | 65.9 |
EXAONE Deep 7.8B | 94.8 | 70.0 / 83.3 | 59.6 / 76.7 | 89.9 | 62.6 | 55.2 |
DeepSeek-R1-Distill-Qwen-7B | 92.8 | 55.5 / 83.3 | 38.5 / 56.7 | 79.7 | 49.1 | 37.6 |
DeepSeek-R1-Distill-Llama-8B | 89.1 | 50.4 / 80.0 | 33.6 / 53.3 | 74.1 | 49.0 | 39.6 |
OpenAI o1-mini | 90.0 | 63.6 / 80.0 | 54.8 / 66.7 | 84.4 | 60.0 | 53.8 |
EXAONE Deep 2.4B | 92.3 | 52.5 / 76.7 | 47.9 / 73.3 | 79.2 | 54.3 | 46.6 |
DeepSeek-R1-Distill-Qwen-1.5B | 83.9 | 28.9 / 52.7 | 23.9 / 36.7 | 65.6 | 33.8 | 16.9 |
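In the table, pass@1 is the average accuracy over individual samples, while cons@64 is the accuracy of the majority-vote (consensus) answer across 64 samples per problem. Below is a minimal sketch of both metrics, assuming the final answers have already been extracted from each sampled generation (e.g., with a helper like `extract_boxed_answer` above):
```python
from collections import Counter

def pass_at_1(sampled_answers: list[str], gold: str) -> float:
    """Average accuracy of individual samples against the gold answer."""
    return sum(a == gold for a in sampled_answers) / len(sampled_answers)

def cons_at_k(sampled_answers: list[str], gold: str) -> float:
    """1.0 if the most common (consensus) answer equals the gold answer."""
    consensus, _ = Counter(sampled_answers).most_common(1)[0]
    return float(consensus == gold)

# Toy example with 64 samples for one problem:
samples = ["33"] * 40 + ["35"] * 24
print(pass_at_1(samples, "33"))  # 0.625
print(cons_at_k(samples, "33"))  # 1.0 -- majority vote recovers the right answer
```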