---
language: en
license: apache-2.0
tags:
- text-classification
- int8
- Intel® Neural Compressor
- neural-compressor
- PostTrainingStatic
datasets:
- mrpc
metrics:
- f1
---
# INT8 BERT base uncased finetuned MRPC
## Post-training static quantization
### PyTorch
This is an INT8 PyTorch model quantized with [huggingface/optimum-intel](https://github.com/huggingface/optimum-intel) using [Intel® Neural Compressor](https://github.com/intel/neural-compressor).
The original fp32 model comes from the fine-tuned model [Intel/bert-base-uncased-mrpc](https://huggingface.co/Intel/bert-base-uncased-mrpc).
Calibration uses the train dataloader with a sampling size of 1000.
The linear module **bert.encoder.layer.9.output.dense** falls back to fp32 to meet the 1% relative accuracy loss criterion.
#### Test result
| |INT8|FP32|
|---|:---:|:---:|
| **Accuracy (eval-f1)** |0.8959|0.9042|
| **Model size (MB)** |119|418|
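
The 1% relative accuracy loss criterion mentioned above can be checked against the reported scores. The sketch below assumes the relative loss is defined as `(fp32 - int8) / fp32`; the helper name `relative_loss` and the constants are illustrative and are not part of Intel® Neural Compressor.

```python
def relative_loss(fp32_score: float, int8_score: float) -> float:
    """Fraction of the FP32 score lost after quantization (assumed definition)."""
    return (fp32_score - int8_score) / fp32_score


# eval-f1 values copied from the PyTorch test result table
FP32_F1 = 0.9042
INT8_F1 = 0.8959
TOLERANCE = 0.01  # the 1% relative loss budget

loss = relative_loss(FP32_F1, INT8_F1)
within_budget = loss <= TOLERANCE
print(f"relative F1 loss: {loss:.2%}, within 1% budget: {within_budget}")
```

With the table values, the relative loss comes out at roughly 0.92%, which is inside the 1% budget.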
#### Load with Intel® Neural Compressor:
```python
from optimum.intel import INCModelForSequenceClassification

# Load the INT8 PyTorch model from the Hugging Face Hub
model_id = "Intel/bert-base-uncased-mrpc-int8-static"
int8_model = INCModelForSequenceClassification.from_pretrained(model_id)
```
### ONNX
This is an INT8 ONNX model quantized with [Intel® Neural Compressor](https://github.com/intel/neural-compressor).
The original fp32 model comes from the fine-tuned model [Intel/bert-base-uncased-mrpc](https://huggingface.co/Intel/bert-base-uncased-mrpc).
Calibration uses the eval dataloader with a sampling size of 100.
#### Test result
| |INT8|FP32|
|---|:---:|:---:|
| **Accuracy (eval-f1)** |0.9021|0.9042|
| **Model size (MB)** |236|418|
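
To put the reported model sizes in perspective, the following sketch computes the compression ratio and the size reduction of each INT8 variant relative to the 418 MB FP32 model. The sizes are copied from the two test result tables; the helper `size_summary` is a hypothetical name used only for this illustration.

```python
FP32_MB = 418
# INT8 model sizes (MB) from the PyTorch and ONNX test result tables
INT8_MB = {"PyTorch": 119, "ONNX": 236}


def size_summary(fp32_mb: float, int8_mb: float) -> tuple[float, float]:
    """Return (compression ratio, fractional size reduction)."""
    return fp32_mb / int8_mb, 1 - int8_mb / fp32_mb


summary = {name: size_summary(FP32_MB, mb) for name, mb in INT8_MB.items()}
for name, (ratio, reduction) in summary.items():
    print(f"{name}: {ratio:.2f}x smaller, {reduction:.1%} size reduction")
```

On these numbers the PyTorch model is about 3.5 times smaller than FP32, while the ONNX model is about 1.8 times smaller.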
#### Load ONNX model:
```python
from optimum.onnxruntime import ORTModelForSequenceClassification

# Load the INT8 ONNX model from the Hugging Face Hub
model = ORTModelForSequenceClassification.from_pretrained('Intel/bert-base-uncased-mrpc-int8-static')
```