Update README.md
README.md
CHANGED
@@ -12,7 +12,7 @@ base_model:

This model is an int2 model with group_size 64 and symmetric quantization of [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1), generated by the [intel/auto-round](https://github.com/intel/auto-round) algorithm. Some layers fall back to 4/16 bits; refer to the section "Generate the model" for more details on the mixed-bit settings.

-Please follow the license of the original model. This model could **NOT** run on other severing
+Please follow the license of the original model. This model can **NOT** run on other serving frameworks.

## How To Use

@@ -20,7 +20,7 @@ Please follow the license of the original model. This model could **NOT** run on

Please note that int2 **may be slower** than int4 on CUDA due to a kernel issue.

-**To prevent potential overflow, we recommend using the CPU version detailed in the next section.**
+**To prevent potential overflow and achieve better accuracy, we recommend using the CPU version detailed in the next section.**

~~~python
import transformers
~~~
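The hunk above cuts off just as the CUDA usage snippet begins. For orientation, here is a minimal sketch of how an AutoRound-quantized checkpoint like this one is commonly loaded through plain transformers; the placeholder repo id, dtype, and generation settings are assumptions, not details taken from this README.

~~~python
import torch
import transformers

# Placeholder repo id: substitute the actual Hugging Face id of this model.
model_id = "<this-repo-id>"

tokenizer = transformers.AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumption: int2 kernels dequantize to fp16 on CUDA
    device_map="auto",
    trust_remote_code=True,
)

prompt = "There is a girl who likes adventure,"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
~~~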
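The model card also points to a "Generate the model" section for the mixed-bit recipe. As a rough illustration of how such a recipe is typically expressed with auto-round, here is a hedged sketch; the entries in `layer_config` are hypothetical examples of 4/16-bit fallbacks, not the layers actually chosen for this model.

~~~python
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "deepseek-ai/DeepSeek-R1"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Hypothetical fallbacks: the real recipe keeps some layers at 4/16 bits,
# but the specific layer names below are illustrative only.
layer_config = {
    "lm_head": {"bits": 16},
    "model.layers.0.self_attn.q_proj": {"bits": 4},
}

autoround = AutoRound(
    model,
    tokenizer,
    bits=2,         # int2 base precision, per the model card
    group_size=64,  # group_size 64, per the model card
    sym=True,       # symmetric quantization, per the model card
    layer_config=layer_config,
)
autoround.quantize()
autoround.save_quantized("./DeepSeek-R1-int2-sym", format="auto_round")
~~~

The actual fallback list would come from the repo's own generation script referenced in the model card.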