high-quality Chinese training datasets
Collection
A suite of high-quality Chinese datasets for pretraining, fine-tuning, or reinforcement learning.
Using opencsg/csg-wukong-2b-chinese-fineweb-edu as the base model, we fine-tuned it on smoltalk-chinese for 2 epochs.
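As a rough illustration of what a 2-epoch run implies for the training schedule, the sketch below records the setup and derives the number of optimizer steps. Only the base model name, the dataset name, and the epoch count come from the text above. The class `FineTuneRun`, the example count, the batch size, the device count, and the gradient accumulation setting are hypothetical placeholders, not the values used for this model.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneRun:
    """Hypothetical description of a supervised fine-tuning run."""

    base_model: str = "opencsg/csg-wukong-2b-chinese-fineweb-edu"  # from the text
    dataset: str = "smoltalk-chinese"                              # from the text
    epochs: int = 2                                                # from the text
    num_examples: int = 100_000     # assumption: placeholder dataset size
    per_device_batch_size: int = 8  # assumption: placeholder
    num_devices: int = 1            # assumption: placeholder
    grad_accum_steps: int = 4       # assumption: placeholder

    @property
    def effective_batch_size(self) -> int:
        # Examples consumed per optimizer step across all devices.
        return self.per_device_batch_size * self.num_devices * self.grad_accum_steps

    def total_optimizer_steps(self) -> int:
        # A partial final batch still counts as one step in each epoch.
        steps_per_epoch = math.ceil(self.num_examples / self.effective_batch_size)
        return steps_per_epoch * self.epochs


run = FineTuneRun()
print(run.effective_batch_size, run.total_optimizer_steps())
```

With these placeholder values, each epoch is one full pass over the dataset, so doubling the epoch count doubles the step count. Substitute the real dataset size and batch settings to get a meaningful figure.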