Update README.md
README.md
CHANGED
@@ -19,21 +19,6 @@ This is an INT8 PyTorch model quantized by [intel/nlp-toolkit](https://github.c
|
|
19 |
|
20 |
The original fp32 model comes from the fine-tuned model [Intel/bert-base-uncased-mrpc](https://huggingface.co/Intel/bert-base-uncased-mrpc).
|
21 |
|
22 |
-
#### Training hyperparameters
|
23 |
-
|
24 |
-
The following hyperparameters were used during training:
|
25 |
-
- learning_rate: 2e-05
|
26 |
-
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
|
27 |
-
- lr_scheduler_type: linear
|
28 |
-
- num_epochs: 3.0
|
29 |
-
- train_batch_size: 8
|
30 |
-
- eval_batch_size: 8
|
31 |
-
- eval_steps: 100
|
32 |
-
- load_best_model_at_end: True
|
33 |
-
- metric_for_best_model: f1
|
34 |
-
- early_stopping_patience = 6
|
35 |
-
- early_stopping_threshold = 0.001
|
36 |
-
|
37 |
### Test result
|
38 |
|
39 |
- Batch size = 8
|
@@ -42,7 +27,7 @@ The following hyperparameters were used during training:
|
|
42 |
| |INT8|FP32|
|
43 |
|---|:---:|:---:|
|
44 |
| **Throughput (samples/sec)** |24.263|11.202|
|
45 |
-
| **Accuracy (eval-
|
46 |
| **Model size (MB)** |174|418|
|
47 |
|
48 |
### Load with nlp-toolkit:
|
@@ -56,3 +41,18 @@ int8_model = OptimizedModel.from_pretrained(
|
|
56 |
|
57 |
Notes:
|
58 |
- The INT8 model has better performance than the FP32 model when the CPU is fully occupied. Otherwise, there will be the illusion that INT8 is inferior to FP32.
|
19 |
|
20 |
The original fp32 model comes from the fine-tuned model [Intel/bert-base-uncased-mrpc](https://huggingface.co/Intel/bert-base-uncased-mrpc).
|
21 |
|
22 |
### Test result
|
23 |
|
24 |
- Batch size = 8
|
|
|
27 |
| |INT8|FP32|
|
28 |
|---|:---:|:---:|
|
29 |
| **Throughput (samples/sec)** |24.263|11.202|
|
30 |
+
| **Accuracy (eval-f1)** |0.9153|0.9042|
|
31 |
| **Model size (MB)** |174|418|
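
The ratios implied by the table above can be checked with a few lines of arithmetic. This is only a sketch: the dictionary and variable names are illustrative, and the numbers are copied from the table rather than measured here.

```python
# Values copied from the test-result table (batch size = 8).
int8 = {"throughput": 24.263, "f1": 0.9153, "size_mb": 174}
fp32 = {"throughput": 11.202, "f1": 0.9042, "size_mb": 418}

# Relative throughput, relative model size, and change in eval-f1.
speedup = int8["throughput"] / fp32["throughput"]
size_ratio = int8["size_mb"] / fp32["size_mb"]
f1_delta = int8["f1"] - fp32["f1"]

print(f"speedup: {speedup:.2f}x")              # speedup: 2.17x
print(f"size: {size_ratio:.1%} of FP32")       # size: 41.6% of FP32
print(f"eval-f1 change: {f1_delta:+.4f}")      # eval-f1 change: +0.0111
```

On these figures the INT8 model is roughly twice as fast and under half the size, with no loss in eval-f1.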
|
32 |
|
33 |
### Load with nlp-toolkit:
|
|
|
41 |
|
42 |
Notes:
|
43 |
- The INT8 model has better performance than the FP32 model when the CPU is fully occupied. Otherwise, there will be the illusion that INT8 is inferior to FP32.
|
44 |
+
|
45 |
+
### Training hyperparameters
|
46 |
+
|
47 |
+
The following hyperparameters were used during training:
|
48 |
+
- learning_rate: 2e-05
|
49 |
+
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
|
50 |
+
- lr_scheduler_type: linear
|
51 |
+
- num_epochs: 3.0
|
52 |
+
- train_batch_size: 8
|
53 |
+
- eval_batch_size: 8
|
54 |
+
- eval_steps: 100
|
55 |
+
- load_best_model_at_end: True
|
56 |
+
- metric_for_best_model: f1
|
57 |
+
- early_stopping_patience = 6
|
58 |
+
- early_stopping_threshold = 0.001
|
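
To illustrate how `early_stopping_patience` and `early_stopping_threshold` interact, here is a simplified sketch in plain Python. It is not the training library's implementation: the function name `first_stop_index` and the eval-f1 history are hypothetical, and the rule assumed is "stop once the metric has failed to improve on the best value by more than the threshold for `patience` consecutive evaluations".

```python
def first_stop_index(f1_history, patience=6, threshold=0.001):
    """Return the index of the evaluation that triggers early stopping, or None."""
    best = None
    bad_evals = 0  # consecutive evaluations without a sufficient improvement
    for i, f1 in enumerate(f1_history):
        if best is None or f1 - best > threshold:
            best = f1
            bad_evals = 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                return i
    return None


# Hypothetical eval-f1 values, one per evaluation (eval_steps: 100).
history = [0.85, 0.88, 0.90, 0.9005, 0.9003, 0.9001, 0.9008, 0.9002, 0.9004, 0.91]
print(first_stop_index(history))  # 8
```

In this made-up history the metric plateaus after the third evaluation, so training stops at the ninth evaluation (index 8) and never reaches the final value of 0.91. With `load_best_model_at_end: True`, the checkpoint with the best f1 seen so far would be the one kept.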