deepseek-ai/DeepSeek-R1

Update README.md

#207 opened about 16 hours ago by

mehdi131

Update README.md

#206 opened 3 days ago by

YUIHG

Where does the o1-1217 data in DeepSeek come from? I couldn't find it through OpenAI's official channels. Thanks 🙏

2

#205 opened 5 days ago by

747860199qq

Any R1 reasoning researchers looking for samples?

#204 opened 5 days ago by

natcolley

Update README.md

#203 opened 5 days ago by

umar759

Request: DOI

#202 opened 6 days ago by

Yenugu12

Create 9889555

#201 opened 6 days ago by

keyi8

Upload 657f0f06e7ea1b09462a7a16_Feedback and evaluation-p-500.png

#200 opened 9 days ago by

likhonsheikh

Best practice for R1 models evaluation: Reasoning efficiency and Performance by MATH-Level

#198 opened 13 days ago by

wangxingjun778

DeepSeek R1 full-power version occasionally ends without returning </think>.

#196 opened 14 days ago by

yizhiezi

DeepSeek full-power version occasionally ends without returning </think>

1

#195 opened 14 days ago by

yizhiezi

Standing at a flag in Netherlands

#194 opened 15 days ago by

Sweetstacg

Delete Config.json

#193 opened 18 days ago by

jana0010

Update README.md

#192 opened 19 days ago by

caraanchoa

Add <think>\n> tag to assistant responses to ensure consistency

#191 opened 19 days ago by

REN0430

fix for transformers 4.49 compatibility

#189 opened 19 days ago by

katuni4ka

MLLM discussion group

#188 opened 20 days ago by

YLHX

Question about expert selection

#186 opened 20 days ago by

waynebian

Hardware Requirements to run the original model - 671B params

4

#185 opened 21 days ago by

EdilCamil

Holding paper in hand

1

#184 opened 22 days ago by

Loveyl

Update config.json

#182 opened 22 days ago by

Empolean2640

Regression in Reasoning Tag Output - Missing <think> in Model Responses

1

#181 opened 22 days ago by

divinerapier

Delete model.safetensors.index.json

#180 opened 22 days ago by

Huggingfaceliaj

Unknown quantization type, got fp8

#179 opened 24 days ago by

DenisFavaCerchiaro

How to disable/skip the <think></think> process?

3

#178 opened 24 days ago by

yech520

Request: DOI

#177 opened 25 days ago by

Tamwyn

Request: DOI

#176 opened 26 days ago by

saathwik

Request: DOI

#175 opened 27 days ago by

Paulabad

Draft model as accelerator for DeepSeek-R1?

4

#174 opened 27 days ago by

inputout

Does R1 support long context (> 4K)?

#172 opened 28 days ago by

ghostplant

Deploying production ready service with Unsloth GGUF quants on your AWS account. (4 x L40S)

8

#171 opened 28 days ago by

samagra-tensorfuse

Could you take a look at the "r1-1776" model released by Perplexity?

4

#170 opened 28 days ago by

yanyihan

Just crossed 10,000 likes!

1

#169 opened 29 days ago by

clem

Cannot download flash_attn on Mac

#168 opened 30 days ago by

earlyIsLate

Can this model be used for commercial use?

2

#167 opened about 1 month ago by

henrycwf

90+ tokens per second for MI300x8 using batch_size = 1

1

#166 opened about 1 month ago by

ghostplant

RytryR1

#165 opened about 1 month ago by

Rocka01

"aha moment" comment deleted by Perplexity (recovered)

3

#164 opened about 1 month ago by

FalconNet

Garbled output

1

#163 opened about 1 month ago by

cell22

'num_hidden_layers': 61, but layer 62 has weights.

#162 opened about 1 month ago by

xinhe

Upload GTG Breaking every Limit

#161 opened about 1 month ago by

GTGenesis

support prefix complete

3

#158 opened about 1 month ago by

HuggineAllen

Create app.py

#157 opened about 1 month ago by

SpaceAgeRobotics

Create 1

#156 opened about 1 month ago by

madevii

Brokersponsor

#155 opened about 1 month ago by

Brokersponsor

Update README.md

#154 opened about 1 month ago by

egegvner

Upload IMG_4530.png

#152 opened about 1 month ago by

Noemie202586

Upload IMG_1745.JPG

#151 opened about 1 month ago by

Ladib

Create Clara

1

#150 opened about 1 month ago by

Clblinks

If I understand correctly, evaluating MATH-500 requires 64*500 model calls?

1

#149 opened about 1 month ago by

Rorschaaaach
