Intel/bert-base-uncased-mrpc-int8-qat-inc

Text Classification

text-classfication

Intel® Neural Compressor

QuantizationAwareTraining

Inference Endpoints


xinhe committed on Apr 11, 2022

Commit 1af1979, 1 Parent(s): 72920eb

Update README.md

Files changed (1)

README.md +16 -16


@@ -19,21 +19,6 @@ This is an INT8  PyTorch model quantized by [intel/nlp-toolkit](https://github.c
 The original fp32 model comes from the fine-tuned model [Intel/bert-base-uncased-mrpc](https://huggingface.co/Intel/bert-base-uncased-mrpc).
-#### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 2e-05
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- num_epochs: 3.0
-- train_batch_size: 8
-- eval_batch_size: 8
-- eval_steps: 100
-- load_best_model_at_end: True
-- metric_for_best_model: f1
-- early_stopping_patience = 6
-- early_stopping_threshold = 0.001
 ### Test result
 - Batch size = 8
@@ -42,7 +27,7 @@ The following hyperparameters were used during training:
 |   |INT8|FP32|
 |---|:---:|:---:|
 | **Throughput (samples/sec)**  |24.263|11.202|
-| **Accuracy (eval-accuracy)** |0.9153|0.9042|
+| **Accuracy (eval-f1)** |0.9153|0.9042|
 | **Model size (MB)**  |174|418|
 ### Load with nlp-toolkit:
@@ -56,3 +41,18 @@ int8_model = OptimizedModel.from_pretrained(
 Notes:
  - The INT8 model has better performance than the FP32 model when the CPU is fully occupied. Otherwise, there will be the illusion that INT8 is inferior to FP32.
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 3.0
+- train_batch_size: 8
+- eval_batch_size: 8
+- eval_steps: 100
+- load_best_model_at_end: True
+- metric_for_best_model: f1
+- early_stopping_patience = 6
+- early_stopping_threshold = 0.001
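
Reviewer note: the hyperparameters moved by this commit combine `eval_steps: 100`, `metric_for_best_model: f1`, `early_stopping_patience = 6`, and `early_stopping_threshold = 0.001`. The sketch below shows one plausible reading of how those settings interact: training stops once 6 consecutive evaluations fail to improve F1 by more than 0.001. It is an illustration only, not the code used for this model and not the library's implementation. The function name `stop_index` and the F1 values are hypothetical.

```python
from typing import Optional, Sequence


def stop_index(
    f1_history: Sequence[float],
    patience: int = 6,          # early_stopping_patience from the README
    threshold: float = 0.001,   # early_stopping_threshold from the README
) -> Optional[int]:
    """Return the index of the evaluation that triggers early stopping.

    Each entry in f1_history stands for one evaluation (every 100 steps
    with eval_steps: 100). Returns None if stopping is never triggered.
    """
    best: Optional[float] = None
    stale = 0  # consecutive evaluations without a sufficient improvement
    for i, f1 in enumerate(f1_history):
        if best is None or f1 - best > threshold:
            # Improvement larger than the threshold: record it and reset.
            best = f1
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return i
    return None


# Hypothetical F1 values, one per evaluation; not measurements from this model.
history = [0.80, 0.85, 0.90, 0.9005, 0.90, 0.90, 0.90, 0.90, 0.90]
print(stop_index(history))
```

With `load_best_model_at_end: True`, the checkpoint kept after stopping would be the one with the best F1, not the last one evaluated.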