Disclaimer: This is not an official release by Anthropic.
Mythos-nano is an independent open model project.

Mythos-nano


🏆 Benchmarks

Mythos-nano (3B) vs. frontier models. +CLR = with test-time CLR boost.

| Benchmark | Mythos-nano | +CLR | Qwen3.6 Plus | Gemini 3 Pro | GLM-5 | Kimi K2.5 | Claude Opus 4.5 |
|---|---|---|---|---|---|---|---|
| AIME'25 | 91.4 | 96.7 | 93.3 | 96.0 | 96.7 | 96.1 | 92.8 |
| AIME'26 | 94.3 | 97.1 | 95.3 | 91.7 | 95.8 | 93.3 | 95.1 |
| HMMT'25 | 89.3 | 95.4 | 96.7 | 97.5 | 97.9 | 95.4 | 92.9 |
| IMO-AnswerBench | 76.4 | 80.6 | 83.8 | 83.1 | 82.5 | 81.8 | 78.5 |
| LiveCodeBench v6 | 80.2 | - | 87.1 | 87.4 | 85.5 | 85.0 | 84.8 |
| IFBench | 74.5 | - | 74.2 | 70.4 | 76.5 | 70.0 | 58.0 |

Full comparison (mathematics · coding · knowledge · instruction)

Model Params AIME25 AIME26 HMMT25 BruMO25 IMO-Ans LCBv6 OJBench GPQA-D IFEval IFBench
Kimi K2.5 1T 96.1 93.3 95.4 98.3 81.8 85.0 54.7 87.6 93.9 70.0
GLM-5 744B 96.7 95.8 97.9 82.5 85.5 55.0 86.0 92.6 76.5
DeepSeek V3.2 671B 93.1 94.2 90.2 96.7 78.3 80.8 48.4 82.4 92.6 60.7
Gemini 3 Pro N/A 96.0 91.7 97.5 98.3 83.1 87.4 58.8 91.9 70.4
Claude Opus 4.5 N/A 92.8 95.1 92.9 78.5 84.8 87.0 58.0
GPT-5 (high) N/A 94.6 88.3 91.7 76.0 84.5 85.7 73.1
Mythos-nano 3B 91.4 94.3 89.3 93.8 76.4 80.2 38.6 70.2 93.4 74.5
Mythos-nano + CLR 3B 96.7 97.1 95.4 99.2 80.6 72.9

LeetCode contests (Python, pass-rate)

| Model | Aggregate |
|---|---|
| GPT-5.3-Codex | 100.0% (128/128) |
| Gemini 3.1 Pro | 99.2% (127/128) |
| Gemini 3 Flash | 96.9% (124/128) |
| Mythos-nano | 96.1% (123/128) |
| GPT-5.2 | 95.3% (122/128) |
| Qwen3-Max | 91.4% (117/128) |
| Kimi K2.5 | 90.6% (116/128) |
| Claude Opus 4.6 | 86.7% (111/128) |
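
The aggregate column is the number of solved problems divided by the 128 contest problems. As a sanity check, the following sketch recomputes each rate from the counts in the table above and reports any model whose listed percentage is more than 0.05 points away from the recomputed value. The helper names `pass_rate` and `mismatches` are illustrative and not part of the project.

```python
# (listed percentage, solved count) copied from the LeetCode table above.
RESULTS = {
    "GPT-5.3-Codex": (100.0, 128),
    "Gemini 3.1 Pro": (99.2, 127),
    "Gemini 3 Flash": (96.9, 124),
    "Mythos-nano": (96.1, 123),
    "GPT-5.2": (95.3, 122),
    "Qwen3-Max": (91.4, 117),
    "Kimi K2.5": (90.6, 116),
    "Claude Opus 4.6": (86.7, 111),
}
TOTAL_PROBLEMS = 128


def pass_rate(solved: int, total: int = TOTAL_PROBLEMS) -> float:
    """Pass rate in percent, unrounded."""
    return 100.0 * solved / total


def mismatches(results=RESULTS, tol: float = 0.05) -> list:
    """Models whose listed percentage differs from solved/total by more than `tol` points."""
    return [
        model
        for model, (listed, solved) in results.items()
        if abs(pass_rate(solved) - listed) > tol
    ]


print(mismatches())
```

An empty list means every listed percentage agrees with its count to one decimal place.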

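Against trillion-parameter-scale systems such as Kimi K2.5 (1T), the 3B base model trails by roughly 5 to 6 points on AIME'25, HMMT'25, IMO-AnswerBench, and LiveCodeBench v6, and leads on AIME'26. With CLR it matches or beats Kimi K2.5 on AIME'25, AIME'26, and HMMT'25. This supports the core thesis: with verifiable feedback, small models can reach frontier-level reasoning.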
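
To make the size of the gap concrete, the sketch below computes per-benchmark differences between Mythos-nano and Kimi K2.5 (the 1T-parameter entry), using scores copied from the tables above. The dictionary layout and the `gap` helper are illustrative only. The +CLR entry for LiveCodeBench v6 is left as `None` because the tables do not give one unambiguously.

```python
# Scores copied from the benchmark tables above (higher is better).
SCORES = {
    "AIME'25": {"mythos": 91.4, "mythos_clr": 96.7, "kimi": 96.1},
    "AIME'26": {"mythos": 94.3, "mythos_clr": 97.1, "kimi": 93.3},
    "HMMT'25": {"mythos": 89.3, "mythos_clr": 95.4, "kimi": 95.4},
    "IMO-AnswerBench": {"mythos": 76.4, "mythos_clr": 80.6, "kimi": 81.8},
    "LiveCodeBench v6": {"mythos": 80.2, "mythos_clr": None, "kimi": 85.0},
}


def gap(bench: str, system: str = "mythos", reference: str = "kimi"):
    """Reference score minus system score, in points.

    Positive means the system trails the reference; None means a score is missing.
    """
    row = SCORES[bench]
    if row[system] is None or row[reference] is None:
        return None
    return round(row[reference] - row[system], 1)


for bench in SCORES:
    print(f"{bench:<18} base: {gap(bench)}  +CLR: {gap(bench, system='mythos_clr')}")
```

A negative value means Mythos-nano scores higher than Kimi K2.5 on that benchmark.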

Usage

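from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Load the tokenizer and the BF16 weights onto the GPU
tok = AutoTokenizer.from_pretrained("squ11z1/Mythos-nano")
model = AutoModelForCausalLM.from_pretrained("squ11z1/Mythos-nano", dtype=torch.bfloat16, device_map="cuda")
msgs = [{"role": "user", "content": "Find all integer solutions of x^2 - y^2 = 12."}]
# return_dict=True yields input_ids plus attention_mask, which generate() expects
inputs = tok.apply_chat_template(msgs, add_generation_prompt=True, return_dict=True, return_tensors="pt").to("cuda")
# do_sample=True is needed for temperature to take effect
out = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.6)
# Decode only the newly generated tokens, not the prompt
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))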

Recommended sampling: temperature 0.6–1.0, up to 40960 output tokens for hard problems.
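
The recommendation above maps directly onto keyword arguments for `generate` in Hugging Face Transformers. A minimal sketch, assuming a hypothetical helper named `sampling_kwargs`: the 2048-token default mirrors the usage snippet, and the `hard` flag is an illustrative way to switch to the 40960-token budget. Neither is part of the model's own tooling.

```python
def sampling_kwargs(temperature: float = 0.6, hard: bool = False) -> dict:
    """Build keyword arguments for `model.generate` from the recommended settings."""
    if not 0.6 <= temperature <= 1.0:
        raise ValueError("temperature is outside the recommended 0.6-1.0 range")
    return {
        # Without do_sample=True, generate() may decode greedily and ignore temperature.
        "do_sample": True,
        "temperature": temperature,
        # 40960 is the stated ceiling for hard problems; 2048 matches the usage snippet.
        "max_new_tokens": 40960 if hard else 2048,
    }


print(sampling_kwargs(temperature=0.8, hard=True))
```

The result can be unpacked into the call, for example `model.generate(..., **sampling_kwargs(hard=True))`. A long token budget increases memory use and latency, so the larger value is best reserved for problems that need it.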

GGUF

mythos-nano-f16.gguf and mythos-nano-Q4_K_M.gguf are provided for llama.cpp / Ollama.
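
One way to try the GGUF files is through llama.cpp's `llama-cli` binary. The sketch below only builds the command line. It assumes `llama-cli` is on your `PATH` and that the GGUF file has been downloaded to the working directory. The helper name `llama_cli_args` is hypothetical, and the temperature and token values are taken from the usage section rather than from any llama.cpp-specific recommendation.

```python
import shlex


def llama_cli_args(model_path: str, prompt: str,
                   temperature: float = 0.6, max_tokens: int = 2048) -> list:
    """Build an argv list for llama.cpp's `llama-cli`."""
    return [
        "llama-cli",
        "-m", model_path,            # path to the GGUF file
        "-p", prompt,                # prompt text
        "--temp", str(temperature),  # sampling temperature
        "-n", str(max_tokens),       # maximum number of tokens to generate
    ]


args = llama_cli_args(
    "mythos-nano-Q4_K_M.gguf",
    "Find all integer solutions of x^2 - y^2 = 12.",
)
print(shlex.join(args))  # a copy-pasteable shell command
# To run it from Python instead: subprocess.run(args, check=True)
```

The Q4_K_M file trades some accuracy for a much smaller memory footprint than the F16 file, so it is the more practical choice on modest hardware.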

License

MIT.

Model size: 3B params (Safetensors, BF16)

Base model: Qwen/Qwen2.5-3B (this model is a finetune)
