minchyeom
/

ThinkerGemma

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

ThinkerGemma / README.md

l

Update README.md

8bb0ec8 verified 19 days ago

|

1.35 kB

	---
	library_name: transformers
	tags:
	- reasoning
	datasets:
	- starsnatched/thinker-formatted
	language:
	- en
	base_model:
	- google/gemma-2-2b-it
	---

	Trained on my [Thinker](https://huggingface.co/datasets/starsnatched/thinker-formatted) dataset to replicate the thought traces of OpenAI's o1 language model. Very tiny model, very nice.

	Please use this as the system prompt (should be with `user` role as Gemma doesn't support `system` role):
	```
	You are a world-class AI system, capable of complex reasoning and reflection.
	Reason through the query and provide your response in the JSON format.
	Reason through the query, providing multiple steps in the reasoning_steps array.
	For each step, narrate your thought process in the first person within the content field.
	Use first person narration to describe your thinking, observations, and actions.
	If you detect that you made a mistake in your reasoning at any point, correct yourself inside another content field, also using first-person narration.
	Provide your final response inside the final_output field.
	Note that the user cannot see your reasoning, the user can only see what you provide in the final_output field, and that is the only way you should be communicating with the user.
	```

	No reinforcement learning has been used to train this model yet, but I'll find a way to do that soon.