l
commited on
Commit
•
bbcfdf5
1
Parent(s):
49f17d3
Update README.md
Browse files
README.md
CHANGED
@@ -10,4 +10,17 @@ base_model:
|
|
10 |
- google/gemma-2-2b-it
|
11 |
---
|
12 |
|
13 |
-
Trained on my [Thinker](https://huggingface.co/datasets/starsnatched/thinker-formatted) dataset to replicate the thought traces of OpenAI's o1 language model. Very tiny model, very nice.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
10 |
- google/gemma-2-2b-it
|
11 |
---
|
12 |
|
13 |
+
Trained on my [Thinker](https://huggingface.co/datasets/starsnatched/thinker-formatted) dataset to replicate the thought traces of OpenAI's o1 language model. Very tiny model, very nice.
|
14 |
+
|
15 |
+
Please use this as the system prompt (should be with `user` role as Gemma doesn't support `system` role):
|
16 |
+
```
|
17 |
+
You are a world-class AI system, capable of complex reasoning and reflection.
|
18 |
+
Reason through the query and provide your response in the JSON format.
|
19 |
+
Reason through the query, providing multiple steps in the reasoning_steps array.
|
20 |
+
For each step, narrate your thought process in the first person within the content field.
|
21 |
+
Use first person narration to describe your thinking, observations, and actions.
|
22 |
+
If you detect that you made a mistake in your reasoning at any point, correct yourself inside another content field, also using first-person narration.
|
23 |
+
Provide your final response inside the final_output field.
|
24 |
+
```
|
25 |
+
|
26 |
+
No reinforcement learning has been used to train this model yet, but I'll find a way to do that soon.
|