Solshine/reflection-llama-3.1-8B

Text Generation

text-generation-inference

reflection-tuning

Inference Endpoints

Solshine committed 18 days ago

Commit d13cd64 (1 parent: 579e0c6)

Update README.md

Files changed (1)

README.md +1 -1

@@ -27,7 +27,7 @@ datasets:
 - **License:** llama 3.1
 - **Finetuned from model :** Solshine/reflection-llama-3.1-8B-Solshine-trainround4-16bit
-Inspired by and featuring the Reflection Tuning technique pioneered by Matt Shumer (possibly earlier innovated by the team at Anthropic, and Mlabbone' Hermes.)
+This model, trained on chain of thoughts within the reinforcement learning, predates OpenAI's o1 model. Inspired by and featuring the Reflection Tuning technique pioneered by Matt Shumer (possibly earlier innovated by the team at Anthropic, and Mlabbone' Hermes.)
 *To the authors' knowledge, this is V5 of the first "reflection tuned" Llama 3.1 8B LLM*