|
---
license: apache-2.0
language:
- en
- ru
base_model:
- Qwen/Qwen2.5-7B-Instruct
---
|
|
|
## Description |
|
|
|
*Eule* is my attempt at reproducing OpenAI's o1 series of reasoning models. At the moment the point is not to hit good scores on benchmarks (this model is rather stupid),
but to introduce a qualitative change in how the LLM approaches tasks. Like the o1 models, Eule approaches its problems in a step-by-step manner.
It is also trained to reason in Russian (ultimately, I want to make a decent Russian reasoning model).
|
|
|
Cool things I found while playing with it: |
|
1. It tries to verify its solutions to make sure they are correct. |
|
2. When it fails, it sometimes iterates on the problem and tries a new approach or fixes the mistake.
|
|
|
Bad things: |
|
1. It is stupid: probably no smarter than the instruct model it is based on (not a strict claim, I haven't run any benchmarks yet). Still, it's interesting to inspect its chains of thought.
|
2. The final response (after the reasoning chain) is in English. |
|
3. Sometimes the model may not produce `<|REASONING_END|>`, which messes up parsing (a defensive parsing sketch is given in the Training Details section below).
|
|
|
At the moment it is trained only on math data, but it can solve riddles and other problems that require step-by-step reasoning.
I'm planning to add more non-math data and then proceed to RL.
|
|
|
## Training Details |
|
|
|
It was trained using [kolibrify](https://github.com/oKatanaaa/kolibrify) on a single H800 for about 6 hours. |
|
The training data consists of math problems with solutions formatted as deliberate reasoning chains. The longest reasoning chain is ~19,000 tokens.
|
|
|
|
|
The model follows the ChatML template, but introduces several new tokens:
|
- `<|REASONING_START|>` - start of a reasoning chain. |
|
- `<|REASONING_END|>` - end of a reasoning chain. |
|
- `<|RSS|>` - start of a reasoning step. |
|
- `<|RSE|>` - end of a reasoning step. |
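
These markers ship as part of the released tokenizer. As a quick sanity check (a minimal sketch assuming the standard `transformers` tokenizer API), you can verify that each marker is encoded as a single token rather than being split into pieces:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('kaleinaNyan/eule-qwen2.5instruct-7b-111224')

# Each reasoning marker should map to exactly one token id.
for marker in ['<|REASONING_START|>', '<|REASONING_END|>', '<|RSS|>', '<|RSE|>']:
    ids = tokenizer.encode(marker, add_special_tokens=False)
    print(marker, '->', ids)
    assert len(ids) == 1, f'{marker} was split into multiple tokens'
```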
|
|
|
A typical conversation formatting structure is as follows: |
|
```
<|im_start|>system
System message<|im_end|>
<|im_start|>user
Problem description<|im_end|>
<|im_start|>assistant
<|REASONING_START|><|RSS|>step 1<|RSE|><|RSS|>step 2<|RSE|><|REASONING_END|>Final assistant response<|im_end|>
```
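
For downstream use you'll want to strip these markers. Here is a minimal parsing sketch (my own helper, not something shipped with the model) that splits a completion into reasoning steps and the final response, and degrades gracefully when `<|REASONING_END|>` is missing (bad thing #3 above):

```python
def parse_completion(text: str):
    # Drop everything up to and including the reasoning-start marker.
    reasoning = text.split('<|REASONING_START|>')[-1]
    if '<|REASONING_END|>' in reasoning:
        reasoning, final = reasoning.split('<|REASONING_END|>', 1)
    else:
        # The model sometimes forgets this marker; treat the whole tail as reasoning.
        final = ''
    # Split into individual steps and clean the step-end markers.
    raw_steps = [s.replace('<|RSE|>', '').strip() for s in reasoning.split('<|RSS|>')]
    steps = [s for s in raw_steps if s]
    return steps, final.replace('<|im_end|>', '').strip()
```

Usage: `steps, final = parse_completion(response)`, where `response` is a decoded completion like the one produced in the next section.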
|
|
|
## How to Get Started with the Model |
|
|
|
I use [unsloth](https://github.com/unslothai/unsloth) and recommend you do the same: |
|
```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name='kaleinaNyan/eule-qwen2.5instruct-7b-111224'
)
FastLanguageModel.for_inference(model)


def generate(chat, n_tokens, use_cache=True, do_sample=False):
    # Append the assistant header and the reasoning prefix so the model
    # starts its chain of thought right away.
    input_str = tokenizer.apply_chat_template(chat, tokenize=False) + '<|im_start|>assistant\n<|REASONING_START|><|RSS|>'
    inputs = tokenizer(input_str, return_tensors='pt').to(model.device)
    outputs = model.generate(
        input_ids=inputs['input_ids'],
        attention_mask=inputs['attention_mask'],
        max_new_tokens=n_tokens,
        use_cache=use_cache,
        do_sample=do_sample,
        temperature=0.7,  # only takes effect when do_sample=True
    )
    return tokenizer.batch_decode(outputs)[0]


msg = "Come up with a cubic equation and solve it"
system_message = "You are an AI assistant that thoroughly solves any task. Explore various routes and verify your solutions. Reason in Russian. Provide concise responses to the user."

chat = [
    {'role': 'system', 'content': system_message},
    {'role': 'user', 'content': msg},
]
response = generate(chat, 8196, do_sample=True)

# Print the reasoning chain with one step per line.
print('\n'.join(response.split('<|REASONING_START|>')[-1].split('<|RSS|>')))
```
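
If you'd rather not depend on unsloth, the same prompt construction works with plain `transformers` (an untested minimal sketch reusing the `chat` list from above; the dtype and device settings are assumptions, adjust them for your hardware):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = 'kaleinaNyan/eule-qwen2.5instruct-7b-111224'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map='auto')

# Same trick as above: append the assistant header and reasoning prefix
# so the model starts its chain of thought immediately.
input_str = tokenizer.apply_chat_template(chat, tokenize=False) + '<|im_start|>assistant\n<|REASONING_START|><|RSS|>'
inputs = tokenizer(input_str, return_tensors='pt').to(model.device)
outputs = model.generate(**inputs, max_new_tokens=8196, do_sample=True, temperature=0.7)
print(tokenizer.batch_decode(outputs)[0])
```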
|
|
|
## Evaluation |
|
|
|
I'll provide results on the MATH and GSM8K benchmarks later.