eliebak HF staff committed on
Commit fa4b7ee · verified · 1 Parent(s): b177a4c

Update README.md

Files changed (1)
  1. README.md +22 -1
README.md CHANGED
@@ -6,5 +6,26 @@ colorTo: purple
  sdk: static
  pinned: false
  ---
+ # Welcome to Open-R1 🐳🤗
+ Open-R1 is an open initiative to replicate and extend the techniques behind DeepSeek-R1, a state-of-the-art reasoning model, in a fully transparent and collaborative way. This organization is dedicated to:
+ - Sharing datasets and models built on the path to replicating DeepSeek-R1.
+ - Fostering meaningful discussions and collaboration in the Discussion tab.
+ By working together, we aim to create a robust foundation for reasoning models that the entire research and industry community can leverage.
+
+ # Plan of attack
+ We are using the DeepSeek-R1 tech report as a guide to recreate their pipeline. The work can be broken down into three main steps:
+
+ - Replicate R1-Distill:
+ Distill a high-quality reasoning corpus from DeepSeek-R1 to create the R1-Distill models.
+ - Recreate the pure RL pipeline:
+ Reproduce the reinforcement learning process that DeepSeek used to train R1-Zero. This will likely require curating new, large-scale datasets for math, reasoning, and code.
+ - Demonstrate end-to-end training:
+ Show that we can go from a base model to RL-tuned reasoning capabilities through a multi-stage training approach, combining supervised fine-tuning (SFT) and reinforcement learning (RL).
+
+ # How to contribute
+ This project thrives on community participation! Here are some ways you can contribute:
+ - Join the Discussion: Share ideas, ask questions, and collaborate with others in the Discussion tab.
+ - Contribute Code or Datasets: Submit pull requests with datasets, models, or improvements to the pipeline.
+ - Experiment and Share Results: Try out different approaches and share your findings with the community.
+ Let’s build something impactful together. 🚀

- Edit this `README.md` markdown file to author your organization card.