xinhe committed on
Commit 235263f
Parent: 9c3a30b

Update README.md

Files changed (1): README.md (+4 −4)
README.md CHANGED
README.md CHANGED

@@ -20,16 +20,16 @@ This is an INT8 PyTorch model quantized with [Intel® Neural Compressor](https:

 The original fp32 model comes from the fine-tuned model [Intel/bert-base-uncased-mrpc](https://huggingface.co/Intel/bert-base-uncased-mrpc).

-The calibration dataloader is the train dataloader. The default calibration sampling size of 300 isn't exactly divisible by the batch size of 8, so the real sampling size is 304.
+The calibration dataloader is the train dataloader. The calibration sampling size is 1000.

-The linear modules **bert.encoder.layer.9.output.dense** and **bert.encoder.layer.10.output.dense** fall back to fp32 to meet the 1% relative accuracy loss criterion.
+The linear module **bert.encoder.layer.9.output.dense** falls back to fp32 to meet the 1% relative accuracy loss criterion.

 ### Test result

 | |INT8|FP32|
 |---|:---:|:---:|
-| **Accuracy (eval-f1)** |0.8997|0.9042|
-| **Model size (MB)** |120|418|
+| **Accuracy (eval-f1)** |0.8959|0.9042|
+| **Model size (MB)** |119|418|

 ### Load with Intel® Neural Compressor:
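As a quick sanity check on the 1% criterion mentioned above, the relative accuracy loss and size reduction implied by the updated table can be computed directly. This is a minimal sketch using only the values from the table, not a re-run of the evaluation:

```python
# Values taken from the table above (INT8 vs FP32).
fp32_f1, int8_f1 = 0.9042, 0.8959
fp32_mb, int8_mb = 418, 119

# Relative accuracy loss: should stay within the 1% target.
rel_loss = (fp32_f1 - int8_f1) / fp32_f1
print(f"relative accuracy loss: {rel_loss:.2%}")  # 0.92%

# Model size reduction from INT8 quantization.
ratio = fp32_mb / int8_mb
print(f"compression ratio: {ratio:.1f}x")  # 3.5x
```

The 0.92% relative f1 drop confirms the fallback configuration meets the stated 1% accuracy-loss budget.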