Update README.md
#207 opened about 16 hours ago
by
mehdi131
Update README.md
#206 opened 3 days ago
by
YUIHG

Where does the o1-1217 data in DeepSeek come from? I couldn't seem to find it through OpenAI's official channels. Thank you 🙏
2
#205 opened 5 days ago
by
747860199qq
Any R1 reasoning researchers looking for samples?
#204 opened 5 days ago
by
natcolley
Update README.md
#203 opened 5 days ago
by
umar759
Request: DOI
#202 opened 6 days ago
by
Yenugu12
Create 9889555
#201 opened 6 days ago
by
keyi8
Upload 657f0f06e7ea1b09462a7a16_Feedback and evaluation-p-500.png
#200 opened 9 days ago
by
likhonsheikh

Best practice for R1 models evaluation: Reasoning efficiency and Performance by MATH-Level
#198 opened 13 days ago
by
wangxingjun778

DeepSeek R1 full-power version occasionally ends without returning </think>.
#196 opened 14 days ago
by
yizhiezi
DeepSeek full-power version occasionally ends without returning </think>
1
#195 opened 14 days ago
by
yizhiezi
Standing at a flag in Netherlands
#194 opened 15 days ago
by
Sweetstacg

Delete Config.json
#193 opened 18 days ago
by
jana0010
Update README.md
#192 opened 19 days ago
by
caraanchoa
Add the <think>\n> tag to assistant responses to ensure consistency
#191 opened 19 days ago
by
REN0430
fix for transformers 4.49 compatibility
#189 opened 19 days ago
by
katuni4ka

Question about expert selection
#186 opened 20 days ago
by
waynebian
Hardware Requirements to run the original model - 671B params
4
#185 opened 21 days ago
by
EdilCamil

Holding paper in hand
1
#184 opened 22 days ago
by
Loveyl
Update config.json
#182 opened 22 days ago
by
Empolean2640
Regression in Reasoning Tag Output - Missing <think> in Model Responses
1
#181 opened 22 days ago
by
divinerapier
Delete model.safetensors.index.json
#180 opened 22 days ago
by
Huggingfaceliaj
Unknown quantization type, got fp8
#179 opened 24 days ago
by
DenisFavaCerchiaro
How to disable or skip the <think></think> process
3
#178 opened 24 days ago
by
yech520
Request: DOI
#177 opened 25 days ago
by
Tamwyn
Request: DOI
#176 opened 26 days ago
by
saathwik
Request: DOI
#175 opened 27 days ago
by
Paulabad
Draft model as accelerator for DeepSeek-R1?
4
#174 opened 27 days ago
by
inputout

Does R1 support long context (> 4K)?
#172 opened 28 days ago
by
ghostplant
Deploying production ready service with Unsloth GGUF quants on your AWS account. (4 x L40S)
8
#171 opened 28 days ago
by
samagra-tensorfuse
Could you take a look at the "r1-1776" model released by Perplexity?
4
#170 opened 28 days ago
by
yanyihan
Just crossed 10,000 likes!
1
#169 opened 29 days ago
by
clem

Cannot download flash_attn on Mac
#168 opened 30 days ago
by
earlyIsLate
Can this model be used for commercial use?
2
#167 opened about 1 month ago
by
henrycwf

90+ tokens per second for MI300x8 using batch_size = 1
1
#166 opened about 1 month ago
by
ghostplant
"aha moment" comment deleted by Perplexity (recovered)
3
#164 opened about 1 month ago
by
FalconNet
'num_hidden_layers': 61, but layer 62 has weights.
#162 opened about 1 month ago
by
xinhe
Upload GTG Breaking every Limit
#161 opened about 1 month ago
by
GTGenesis
support prefix complete
3
#158 opened about 1 month ago
by
HuggineAllen
Create app.py
#157 opened about 1 month ago
by
SpaceAgeRobotics

Brokersponsor
#155 opened about 1 month ago
by
Brokersponsor

Update README.md
#154 opened about 1 month ago
by
egegvner
Upload IMG_4530.png
#152 opened about 1 month ago
by
Noemie202586
Upload IMG_1745.JPG
#151 opened about 1 month ago
by
Ladib
Create Clara
1
#150 opened about 1 month ago
by
Clblinks
If I understand correctly, evaluating MATH-500 requires 64*500 model calls?
1
#149 opened about 1 month ago
by
Rorschaaaach