Disclaimer: This is not an official release by Anthropic.
Mythos-nano is an independent open model project.
Mythos-nano
🏆 Benchmarks
Mythos-nano (3B) vs. frontier models. +CLR = with test-time CLR boost.
| Benchmark | Mythos-nano | +CLR | Qwen3.6 Plus | Gemini 3 Pro | GLM-5 | Kimi K2.5 | Claude Opus 4.5 |
|---|---|---|---|---|---|---|---|
| AIME'25 | 91.4 | 96.7 | 93.3 | 96.0 | 96.7 | 96.1 | 92.8 |
| AIME'26 | 94.3 | 97.1 | 95.3 | 91.7 | 95.8 | 93.3 | 95.1 |
| HMMT'25 | 89.3 | 95.4 | 96.7 | 97.5 | 97.9 | 95.4 | 92.9 |
| IMO-AnswerBench | 76.4 | 80.6 | 83.8 | 83.1 | 82.5 | 81.8 | 78.5 |
| LiveCodeBench v6 | 80.2 | – | 87.1 | 87.4 | 85.5 | 85.0 | 84.8 |
| IFBench | 74.5 | – | 74.2 | 70.4 | 76.5 | 70.0 | 58.0 |
Full comparison (mathematics · coding · knowledge · instruction)
| Model | Params | AIME25 | AIME26 | HMMT25 | BruMO25 | IMO-Ans | LCBv6 | OJBench | GPQA-D | IFEval | IFBench |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Kimi K2.5 | 1T | 96.1 | 93.3 | 95.4 | 98.3 | 81.8 | 85.0 | 54.7 | 87.6 | 93.9 | 70.0 |
| GLM-5 | 744B | 96.7 | 95.8 | 97.9 | – | 82.5 | 85.5 | 55.0 | 86.0 | 92.6 | 76.5 |
| DeepSeek V3.2 | 671B | 93.1 | 94.2 | 90.2 | 96.7 | 78.3 | 80.8 | 48.4 | 82.4 | 92.6 | 60.7 |
| Gemini 3 Pro | N/A | 96.0 | 91.7 | 97.5 | 98.3 | 83.1 | 87.4 | 58.8 | 91.9 | – | 70.4 |
| Claude Opus 4.5 | N/A | 92.8 | 95.1 | 92.9 | – | 78.5 | 84.8 | – | 87.0 | – | 58.0 |
| GPT-5 (high) | N/A | 94.6 | – | 88.3 | 91.7 | 76.0 | 84.5 | – | 85.7 | – | 73.1 |
| Mythos-nano | 3B | 91.4 | 94.3 | 89.3 | 93.8 | 76.4 | 80.2 | 38.6 | 70.2 | 93.4 | 74.5 |
| Mythos-nano + CLR | 3B | 96.7 | 97.1 | 95.4 | 99.2 | 80.6 | – | – | 72.9 | – | – |
LeetCode contests (Python, pass-rate)
| Model | Aggregate |
|---|---|
| GPT-5.3-Codex | 100.0% (128/128) |
| Gemini 3.1 Pro | 99.2% (127/128) |
| Gemini 3 Flash | 96.9% (124/128) |
| Mythos-nano | 96.1% (123/128) |
| GPT-5.2 | 95.3% (122/128) |
| Qwen3-Max | 91.4% (117/128) |
| Kimi K2.5 | 90.6% (116/128) |
| Claude Opus 4.6 | 86.7% (111/128) |
Without CLR, the 3B model scores within roughly 5 to 6 points of the 1T-parameter Kimi K2.5 on AIME'25, HMMT'25, and LiveCodeBench v6, and ahead of it on AIME'26. This supports the core thesis: with verifiable feedback, small models reach frontier reasoning.
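The gap to the one trillion-parameter model in the comparison (Kimi K2.5) can be recomputed from the table rows. The sketch below copies the base-model scores (no CLR) by hand; the variable and function names are illustrative and not part of any released tooling.

```python
# Scores copied from the "Full comparison" table (Mythos-nano without CLR).
BENCHMARKS = ["AIME25", "AIME26", "HMMT25", "BruMO25", "IMO-Ans", "LCBv6"]
MYTHOS_NANO = [91.4, 94.3, 89.3, 93.8, 76.4, 80.2]
KIMI_K2_5 = [96.1, 93.3, 95.4, 98.3, 81.8, 85.0]


def gaps(small, large):
    """Return large-minus-small score differences, rounded to one decimal."""
    return [round(l - s, 1) for s, l in zip(small, large)]


# Positive values mean Kimi K2.5 is ahead; negative means Mythos-nano is ahead.
diffs = dict(zip(BENCHMARKS, gaps(MYTHOS_NANO, KIMI_K2_5)))
for name, d in diffs.items():
    print(f"{name}: {d:+.1f}")
```

A positive number is the deficit of the 3B model on that benchmark, so the largest value is the worst-case gap among the six columns.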
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tok = AutoTokenizer.from_pretrained("squ11z1/Mythos-nano")
model = AutoModelForCausalLM.from_pretrained("squ11z1/Mythos-nano", dtype=torch.bfloat16, device_map="cuda")

msgs = [{"role": "user", "content": "Find all integer solutions of x^2 - y^2 = 12."}]
# return_dict=True returns input_ids together with the attention mask.
inputs = tok.apply_chat_template(msgs, add_generation_prompt=True, return_tensors="pt", return_dict=True).to("cuda")
# do_sample=True is needed for temperature to take effect.
out = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.6)
print(tok.decode(out[0], skip_special_tokens=True))
Recommended sampling: temperature 0.6–1.0, up to 40960 output tokens for hard problems.
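One way to keep generation settings inside the recommended range is a small helper that validates them before they reach `generate`. This is a minimal sketch; `generation_kwargs` is a hypothetical helper name, and the bounds are simply the numbers quoted above.

```python
# Bounds taken from the recommendation above; the helper itself is hypothetical.
TEMPERATURE_RANGE = (0.6, 1.0)
MAX_OUTPUT_TOKENS = 40960


def generation_kwargs(temperature=0.6, max_new_tokens=2048):
    """Build sampling kwargs for `generate`, rejecting out-of-range values."""
    lo, hi = TEMPERATURE_RANGE
    if not lo <= temperature <= hi:
        raise ValueError(f"temperature must be in [{lo}, {hi}], got {temperature}")
    if not 0 < max_new_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError(f"max_new_tokens must be in (0, {MAX_OUTPUT_TOKENS}]")
    # Sampling must be switched on explicitly for temperature to apply.
    return {
        "do_sample": True,
        "temperature": temperature,
        "max_new_tokens": max_new_tokens,
    }
```

The returned dict can be unpacked into `model.generate(...)` alongside the tokenized prompt, for example with `generation_kwargs(0.8, 40960)` for a hard problem.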
GGUF
mythos-nano-f16.gguf and mythos-nano-Q4_K_M.gguf are provided for llama.cpp / Ollama.
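For Python users, the GGUF files can also be loaded through llama-cpp-python. The sketch below assumes `pip install llama-cpp-python` and network access to the Hub; the `load_gguf` and `ask` helpers and the `GGUF_FILES` mapping are illustrative names, not part of the repository.

```python
REPO_ID = "squ11z1/Mythos-nano"
# File names as listed above, keyed by a short label (labels are illustrative).
GGUF_FILES = {
    "f16": "mythos-nano-f16.gguf",
    "Q4_K_M": "mythos-nano-Q4_K_M.gguf",
}


def build_messages(question):
    """Wrap a single user question in the chat-message format."""
    return [{"role": "user", "content": question}]


def load_gguf(quant="Q4_K_M", **kwargs):
    """Download (if needed) and load one of the GGUF files from the Hub."""
    from llama_cpp import Llama  # imported lazily; needs llama-cpp-python

    return Llama.from_pretrained(
        repo_id=REPO_ID, filename=GGUF_FILES[quant], **kwargs
    )


def ask(llm, question):
    """Run one chat completion and return the assistant's reply text."""
    result = llm.create_chat_completion(messages=build_messages(question))
    return result["choices"][0]["message"]["content"]
```

Typical use would be `llm = load_gguf("Q4_K_M")` followed by `print(ask(llm, "What is the capital of France?"))`; extra keyword arguments such as a context size are passed through to `Llama`.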
License
MIT.
