Update README.md
README.md CHANGED
```diff
@@ -1,59 +1,54 @@
 ---
-language:
-- en
 license: apache-2.0
 tags:
-- generated_from_trainer
-datasets:
-- glue
 metrics:
-- accuracy
 - f1
-model-index:
-- name: bert-base-uncased-mrpc
-  results:
-  - task:
-      name: Text Classification
-      type: text-classification
-    dataset:
-      name: GLUE MRPC
-      type: glue
-      args: mrpc
-    metrics:
-    - name: Accuracy
-      type: accuracy
-      value: 0.8602941176470589
-    - name: F1
-      type: f1
-      value: 0.9042016806722689
 ---
 
-<!-- This model card was generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 
-# bert-base-uncased-mrpc
 
-This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the GLUE MRPC dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.6978
-- Accuracy: 0.8603
-- F1: 0.9042
-- Combined Score: 0.8822
 
-### Training hyperparameters
 
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size: 16
-- eval_batch_size: 8
-- seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs:
 
-### Framework versions
 
-- Transformers …
-- Pytorch …
-- Datasets …
-- Tokenizers …
```
The updated model card:

---
language: en
license: apache-2.0
tags:
- text-classification
- int8
- QuantizationAwareTraining
datasets:
- mrpc
metrics:
- f1
---

# INT8 BERT base uncased finetuned MRPC

### QuantizationAwareTraining

This is an INT8 PyTorch model quantized with [Intel® Neural Compressor](https://github.com/intel/neural-compressor) as the quantization backend, via [intel/nlp-toolkit](https://github.com/intel/nlp-toolkit). The original FP32 model is the fine-tuned [Intel/bert-base-uncased-mrpc](https://huggingface.co/Intel/bert-base-uncased-mrpc).
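
For reference, quantization-aware training with nlp-toolkit wraps the usual transformers fine-tuning loop in its trainer. The sketch below is illustrative only: it follows the toolkit's documented Trainer-based API, and the exact `NLPTrainer`/`QuantizationConfig` names and arguments are assumptions, not this repository's actual training script.

```python
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

from nlp_toolkit import QuantizationConfig, metrics
from nlp_toolkit.optimization.trainer import NLPTrainer

# GLUE MRPC sentence pairs, tokenized with the FP32 model's tokenizer.
tokenizer = AutoTokenizer.from_pretrained("Intel/bert-base-uncased-mrpc")
raw = load_dataset("glue", "mrpc")
encoded = raw.map(
    lambda ex: tokenizer(ex["sentence1"], ex["sentence2"],
                         truncation=True, padding="max_length", max_length=128),
    batched=True,
)

# Start quantization-aware training from the fine-tuned FP32 checkpoint.
model = AutoModelForSequenceClassification.from_pretrained("Intel/bert-base-uncased-mrpc")
trainer = NLPTrainer(model=model,
                     train_dataset=encoded["train"],
                     eval_dataset=encoded["validation"],
                     tokenizer=tokenizer)

# Accept the INT8 model if eval_f1 stays within 1% (relative) of the FP32 baseline.
tune_metric = metrics.Metric(name="eval_f1", is_relative=True, criterion=0.01)
quant_config = QuantizationConfig(approach="QuantizationAwareTraining",
                                  metrics=[tune_metric])
int8_model = trainer.quantize(quant_config=quant_config)
```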

#### Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):
- learning_rate: 2e-05
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
- train_batch_size: 8
- eval_batch_size: 8
- eval_steps: 100
- load_best_model_at_end: True
- metric_for_best_model: f1
- early_stopping_patience: 6
- early_stopping_threshold: 0.001
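
For orientation, these settings map onto the standard transformers API roughly as follows. This is a sketch, not the repository's actual script: the output path is hypothetical, the Adam betas and epsilon shown above are the library defaults, and early stopping comes from `EarlyStoppingCallback`. A plain `Trainer` is shown; the `NLPTrainer` above extends it and takes the same arguments.

```python
import numpy as np
import evaluate
from transformers import TrainingArguments, Trainer, EarlyStoppingCallback

glue_metric = evaluate.load("glue", "mrpc")

def compute_metrics(eval_pred):
    # Supplies eval_f1 / eval_accuracy so metric_for_best_model="f1" resolves.
    logits, labels = eval_pred
    return glue_metric.compute(predictions=np.argmax(logits, axis=-1),
                               references=labels)

training_args = TrainingArguments(
    output_dir="./bert-base-uncased-mrpc-qat",  # hypothetical path
    learning_rate=2e-5,
    lr_scheduler_type="linear",
    num_train_epochs=3.0,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    evaluation_strategy="steps",
    eval_steps=100,
    save_strategy="steps",
    save_steps=100,            # best-model tracking needs saves aligned with evals
    load_best_model_at_end=True,
    metric_for_best_model="f1",
)

trainer = Trainer(
    model=model,               # the FP32 checkpoint from the sketch above
    args=training_args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=6,
                                     early_stopping_threshold=0.001)],
)
```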

### Test result

- Batch size = 8
- [Amazon Web Services](https://aws.amazon.com/) c6i.xlarge instance (Intel Ice Lake, 4 vCPUs, 8 GB memory).

| | INT8 | FP32 |
|---|:---:|:---:|
| **Throughput (samples/sec)** | 24.263 | 11.202 |
| **Accuracy (eval-f1)** | 0.9153 | 0.9042 |
| **Model size (MB)** | 174 | 418 |

### Load with nlp-toolkit

```python
from nlp_toolkit import OptimizedModel

int8_model = OptimizedModel.from_pretrained(
    'Intel/bert-base-uncased-mrpc-int8-qat',  # this INT8 model repository
)
```
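
Once loaded, the model can be exercised like any transformers sequence classifier. A quick sanity check follows; the sentence pair is an arbitrary example, and the HF-style output with a `.logits` field is an assumption about what `OptimizedModel` returns:

```python
import torch
from transformers import AutoTokenizer

# The INT8 model reuses the FP32 base model's tokenizer.
tokenizer = AutoTokenizer.from_pretrained('Intel/bert-base-uncased-mrpc')

inputs = tokenizer(
    "The company said quarterly profits rose sharply.",
    "Quarterly profits at the company increased strongly, it said.",
    return_tensors="pt",
)
with torch.no_grad():
    logits = int8_model(**inputs).logits
print(logits.argmax(dim=-1).item())  # 1 = paraphrase, 0 = not a paraphrase
```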

Notes:
- The INT8 model only shows its performance advantage over the FP32 model when the CPU is fully loaded; under light load, measurements can give the misleading impression that INT8 is slower than FP32.
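
A throughput figure like the one in the table above can be approximated with a simple timing loop. A minimal sketch, using a batch of 8 as in the benchmark; the sequence length of 128 and the iteration counts are arbitrary choices:

```python
import time
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('Intel/bert-base-uncased-mrpc')

# A synthetic batch of 8 padded sentence pairs.
pairs = ["The company said quarterly profits rose sharply."] * 8
batch = tokenizer(pairs, pairs, padding="max_length", max_length=128,
                  truncation=True, return_tensors="pt")

with torch.no_grad():
    for _ in range(10):                  # warm-up iterations
        int8_model(**batch)
    start = time.perf_counter()
    for _ in range(50):
        int8_model(**batch)
    elapsed = time.perf_counter() - start

print(f"{50 * 8 / elapsed:.3f} samples/sec")
```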