cicdatopea committed (verified)
Commit cfbb6b2 · 1 parent: 22466da
Files changed (1):
  1. README.md +5 -5
README.md CHANGED

@@ -7,13 +7,14 @@ base_model:
 
 
 
+
 ---
 
 ## Model Details
 
-This model is an int2 model with group_size 64 and symmetric quantization of [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1) generated by the [intel/auto-round](https://github.com/intel/auto-round) algorithm. Some layers fall back to 4/16 bits; refer to the "Generate the model" section for details of the mixed-bit settings.
+This model is an int2 model with group_size 64 and symmetric quantization of [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1) generated by the [intel/auto-round](https://github.com/intel/auto-round) algorithm. We recommend using the mixed version [OPEA/DeepSeek-R1-int2-mixed-sym-inc](https://huggingface.co/OPEA/DeepSeek-R1-int2-mixed-sym-inc) for better accuracy.
 
-Please follow the license of the original model. This model could **NOT** run on other serving frameworks.
+Please follow the license of the original model.
 
 ## How To Use
 
@@ -240,7 +241,8 @@ model = AutoModelForCausalLM.from_pretrained(
     torch_dtype=torch.bfloat16,
     trust_remote_code=True,
     device_map="cpu",
-    quantization_config=quantization_config
+    quantization_config=quantization_config,
+    revision="080ef2d"
 )
 
 
@@ -416,7 +418,6 @@ model = AutoModelForCausalLM.from_pretrained(
     torch_dtype=torch.float16,
     trust_remote_code=True,
     device_map=device_map,
-    revision="080ef2d"
 )
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 
@@ -504,7 +505,6 @@ from auto_round import AutoRound
 
 
 
-
 autoround = AutoRound(model=model, tokenizer=tokenizer, device_map=device_map, bits=2, group_size=64,
                       iters=1000, batch_size=4, seqlen=512, nsamples=512, enable_torch_compile=False,
                       )
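For context, the quantization scheme the README names (symmetric 2-bit weights with group_size 64) can be sketched in a few lines of NumPy. This is illustrative arithmetic only, under assumed conventions (signed int2 levels {-2, -1, 0, 1}, per-group absmax scaling), not the auto-round algorithm itself, which additionally tunes rounding via sign-gradient descent over the `iters` steps shown above.

```python
import numpy as np

def fake_quantize_int2_sym(w: np.ndarray, group_size: int = 64) -> np.ndarray:
    """Fake-quantize a 1-D weight vector: symmetric 2-bit, one scale per group.

    Assumed conventions (not taken from auto-round's source):
    signed int2 grid {-2, -1, 0, 1}, scale = per-group absmax / 2.
    """
    groups = w.reshape(-1, group_size)
    scale = np.abs(groups).max(axis=1, keepdims=True) / 2.0
    scale[scale == 0] = 1.0                        # guard all-zero groups
    q = np.clip(np.round(groups / scale), -2, 1)   # round onto the int2 grid
    return (q * scale).reshape(-1)                 # dequantize back to float

rng = np.random.default_rng(0)
w = rng.standard_normal(256).astype(np.float32)
w_q = fake_quantize_int2_sym(w)
# after quantization, each 64-element group holds at most 4 distinct values
```

Raising any group's bit-width (the "fallback to 4/16 bits" mentioned for the mixed variant) simply widens the clip range and grid, trading storage for reduced rounding error on sensitive layers.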