---
title: README
emoji: 📈
colorFrom: yellow
colorTo: purple
sdk: static
pinned: false
---
# Welcome to Open-R1 🐳🤗
Open-R1 is an open initiative to replicate and extend the techniques behind DeepSeek-R1, a state-of-the-art reasoning model, in a fully transparent and collaborative way. Project repository: https://github.com/huggingface/open-r1

This organization is dedicated to:

- Sharing datasets and models built on the path to replicating DeepSeek-R1.
- Fostering meaningful discussions and collaboration in the [Community tab](https://huggingface.co/spaces/open-r1/README/discussions).

By working together, we aim to create a robust foundation for reasoning models that the entire research and industry community can build on.

# Plan of attack
We are using the DeepSeek-R1 technical report as a guide to recreate the pipeline it describes. The work can be broken down into three main steps:

- Replicate R1-Distill:
Distill a high-quality reasoning corpus from DeepSeek-R1 and use it to train the R1-Distill models.
- Recreate the pure RL pipeline:
Reproduce the reinforcement learning process that DeepSeek used to train R1-Zero. This will likely require curating new, large-scale datasets for math, reasoning, and code.
- Demonstrate end-to-end training:
Show that we can go from a base model to RL-tuned reasoning capabilities through a multi-stage training approach that combines supervised fine-tuning (SFT) and reinforcement learning (RL).
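
To make the first step more concrete, here is a minimal sketch of how a distilled reasoning corpus could be assembled. It is an illustration under stated assumptions, not the project's actual pipeline: the `Problem` type, the `Answer:` output format, the `toy_teacher` stand-in for DeepSeek-R1, and the idea of keeping only completions whose final answer matches a reference are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Problem:
    prompt: str
    answer: str  # reference answer used to verify teacher outputs (assumed available)


def extract_final_answer(completion: str) -> str:
    """Return the text after the last 'Answer:' marker (hypothetical output format)."""
    marker = "Answer:"
    idx = completion.rfind(marker)
    return completion[idx + len(marker):].strip() if idx != -1 else ""


def build_distill_corpus(
    problems: Iterable[Problem],
    teacher: Callable[[str], str],
    samples_per_problem: int = 4,
) -> list[dict[str, str]]:
    """Sample teacher completions and keep the first verified one per problem."""
    corpus = []
    for problem in problems:
        for _ in range(samples_per_problem):
            completion = teacher(problem.prompt)
            if extract_final_answer(completion) == problem.answer:
                corpus.append({"prompt": problem.prompt, "completion": completion})
                break  # one verified trace per problem is enough for this sketch
    return corpus


def toy_teacher(prompt: str) -> str:
    # Stand-in for a real call to the teacher model; always answers 4.
    return "<think>2 + 2 is 4.</think>\nAnswer: 4"
```

Calling `build_distill_corpus([Problem("What is 2 + 2?", "4")], toy_teacher)` returns a single prompt/completion pair that could then feed a supervised fine-tuning run. Problems the teacher never answers correctly are simply dropped, which is one possible filtering policy among many.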

# How to contribute
This project thrives on community participation! Here are some ways you can contribute:

- Join the discussion: Share ideas, ask questions, and collaborate with others in the [Community tab](https://huggingface.co/spaces/open-r1/README/discussions).
- Contribute code or datasets: Submit pull requests with datasets, models, or improvements to the pipeline.
- Experiment and share results: Try out different approaches and share your findings with the community.

Let’s build something impactful together. 🚀