The AWQ-quantized model may produce garbled characters when running inference on long texts
9 replies · #24 opened 9 days ago by wx111

Add instructions to run R1-AWQ on SGLang
2 replies · #22 opened 15 days ago by ganler

Requests get stuck when sending long prompts (already solved, but still unclear why)
1 reply · #18 opened 20 days ago by uv0xab

Are there any accuracy results compared to the original DeepSeek-R1?
2 replies · #15 opened 21 days ago by traphix

Can anyone run this model with the SGLang framework?
3 replies · #13 opened 21 days ago by muziyongshixin

Regarding inconsistent token-count calculations
#12 opened 27 days ago by liguoyu3564

Max-Batch-Size, max-num-sequence, and fp_cache fp8_e4m3
#11 opened 28 days ago by BenFogerty

The inference performance of the DeepSeek-R1-AWQ model is weak compared to the DeepSeek-R1 model
8 replies · #3 opened about 1 month ago by qingqingz916