qihoo360/Light-R1-32B

@@ -1,13 +1,16 @@
 ---
-license: apache-2.0
 base_model:
 - Qwen/Qwen2.5-32B-Instruct
+license: apache-2.0
+library_name: transformers
+pipeline_tag: text-generation
 ---
 # Light-R1: Surpassing R1-Distill from Scratch\* with \$1000 through Curriculum SFT & DPO
 *\*from models without long COT*
-[technical report](https://arxiv.org/abs/2503.10460)
+[technical report](https://huggingface.co/papers/2503.10460)
 [GitHub page](https://github.com/Qihoo360/Light-R1)
@@ -128,4 +131,4 @@ Training data are collected from various public sources.
       archivePrefix={},
       url={https://github.com/Qihoo360/Light-R1},
 }
-```
+```
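
One way to sanity-check the front matter after this change is to parse it and confirm that the three keys the diff touches (`license`, `library_name`, `pipeline_tag`) are present. The snippet below is a minimal, stdlib-only sketch: `parse_front_matter` is a hypothetical helper written for this illustration, not part of any Hugging Face library, and it handles only the flat `key: value` and `- item` shapes that appear in this card, not general YAML.

```python
def parse_front_matter(card_text):
    """Parse the flat front matter block at the top of a model card.

    Assumption: only `key: value` pairs and `- item` lists occur,
    which matches the front matter shown in this diff.
    """
    lines = card_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    current_key = None  # key whose list items are being collected
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter of the front matter
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            meta[current_key].append(stripped[2:].strip())
            continue
        key, _, value = stripped.partition(":")
        key, value = key.strip(), value.strip()
        if value:
            meta[key] = value
            current_key = None
        else:
            meta[key] = []  # list items are expected on the following lines
            current_key = key
    return meta


# Front matter as it reads on the new side of the diff.
NEW_CARD = """---
base_model:
- Qwen/Qwen2.5-32B-Instruct
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
---
"""

meta = parse_front_matter(NEW_CARD)
missing = [k for k in ("license", "library_name", "pipeline_tag") if k not in meta]
print(meta)
print("missing keys:", missing)
```

For anything beyond this flat layout, a real YAML parser would be the safer choice.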