Update README.md
README.md
CHANGED
@@ -12,7 +12,7 @@ base_model:

This model is an int2 model with group_size 64 and symmetric quantization of [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1), generated by the [intel/auto-round](https://github.com/intel/auto-round) algorithm. Some layers fall back to 4/16 bits; refer to the section "Generate the model" for more details on the mixed-bit settings.

-Please follow the license of the original model. This model could **NOT** run on other severing
+Please follow the license of the original model. This model can **NOT** run on other serving frameworks.

## How To Use

@@ -20,7 +20,7 @@ Please follow the license of the original model. This model could **NOT** run on

Please note that int2 **may be slower** than int4 on CUDA due to a kernel issue.

-**To prevent potential overflow, we recommend using the CPU version detailed in the next section.**
+**To prevent potential overflow and achieve better accuracy, we recommend using the CPU version detailed in the next section.**

~~~python
import transformers
~~~
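The hunk above cuts off just as the CUDA usage snippet begins. For orientation, here is a minimal sketch of how an AutoRound-quantized checkpoint like this one is commonly loaded through plain transformers; the placeholder repo id, dtype, and generation settings are assumptions, not details taken from this README.

~~~python
import torch
import transformers

# Placeholder repo id: substitute the actual Hugging Face id of this model.
model_id = "<this-repo-id>"

tokenizer = transformers.AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumption: int2 kernels dequantize to fp16 on CUDA
    device_map="auto",
    trust_remote_code=True,
)

prompt = "There is a girl who likes adventure,"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
~~~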
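The model card also points to a "Generate the model" section for the mixed-bit recipe. As a rough illustration of how such a recipe is typically expressed with auto-round, here is a hedged sketch; the entries in `layer_config` are hypothetical examples of 4/16-bit fallbacks, not the layers actually chosen for this model.

~~~python
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "deepseek-ai/DeepSeek-R1"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Hypothetical fallbacks: the real recipe keeps some layers at 4/16 bits,
# but the specific layer names below are illustrative only.
layer_config = {
    "lm_head": {"bits": 16},
    "model.layers.0.self_attn.q_proj": {"bits": 4},
}

autoround = AutoRound(
    model,
    tokenizer,
    bits=2,         # int2 base precision, per the model card
    group_size=64,  # group_size 64, per the model card
    sym=True,       # symmetric quantization, per the model card
    layer_config=layer_config,
)
autoround.quantize()
autoround.save_quantized("./DeepSeek-R1-int2-sym", format="auto_round")
~~~

The actual fallback list would come from the repo's own generation script referenced in the model card.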