
RekaAI/reka-flash-3
Then what if we do the same, but put the whole conversation in the first user input? So it would be:

System prompt
User: conversation history

Then ask R1 to generate the thinking. The number of messages in the conversation history should be varied. That way, the bot reply will always contain thinking.
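A minimal sketch of what that history-packing step might look like, assuming the conversation is a simple list of (role, text) pairs and an OpenAI-style message format (the function name, prompt wording, and data layout here are all illustrative assumptions, not any real API):

```python
def pack_history_as_first_user_msg(system_prompt, history):
    """Serialize a multi-turn conversation into a single user message,
    so the reasoning model (e.g. R1) sees the whole context at once and
    can generate a thinking trace for the next bot reply.

    `history` is a list of (role, text) tuples; role names are free-form.
    """
    lines = [f"{role}: {text}" for role, text in history]
    packed = "Conversation so far:\n" + "\n".join(lines)
    # The packed history becomes the *first* user turn after the system prompt.
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": packed},
    ]
```

Varying `len(history)` per example would give the mix of short and long histories described above.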
What do you think about building part of the dataset with replies that already have some context? E.g. 50% of the data has thinking generated from the first user message, and the rest of the dataset looks like:

User,
Bot (no thinking),
User,
Bot (no thinking),
... repeated N times,

and only then we ask R1 to think and train on that. So the model will understand long context better.
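The multi-turn variant could be sketched as a transform that keeps the thinking trace only on the final bot reply, leaving earlier replies plain. The dict schema, `thinking` field, and `<think>` tags below are assumptions for illustration, not a real dataset format:

```python
def strip_intermediate_thinking(conversation):
    """Produce a training example where only the last assistant turn
    carries a reasoning trace; earlier assistant turns stay plain.

    `conversation` is a list of dicts with 'role', 'content', and an
    optional 'thinking' key (schema is assumed, not standard).
    """
    last_assistant = max(
        i for i, m in enumerate(conversation) if m["role"] == "assistant"
    )
    out = []
    for i, msg in enumerate(conversation):
        m = {"role": msg["role"], "content": msg["content"]}
        if i == last_assistant and "thinking" in msg:
            # Prepend the R1-generated trace only on the final reply.
            m["content"] = f"<think>{msg['thinking']}</think>\n{msg['content']}"
        out.append(m)
    return out
```

Mixing examples from this transform with the 50% single-turn-thinking split would give the dataset composition proposed above.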