Hi, nice work! Would you consider trying SFT or RL on the base model (for example, training on data from DeepSeek-R1) to turn it into an o1-like chat model?