dathi103/gerskill-gbert

+---
+license: mit
+base_model: deepset/gbert-base
+tags:
+- generated_from_trainer
+model-index:
+- name: gerskill-gbert
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# gerskill-gbert
+This model is a fine-tuned version of [deepset/gbert-base](https://huggingface.co/deepset/gbert-base) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.1516
+- Hard: precision 0.6638, recall 0.7696, F1 0.7128 (support: 803)
+- Soft: precision 0.6543, recall 0.7935, F1 0.7172 (support: 155)
+- Overall Precision: 0.6622
+- Overall Recall: 0.7735
+- Overall F1: 0.7135
+- Overall Accuracy: 0.9526
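The overall precision, recall, and F1 are consistent with micro-averaging over the Hard and Soft entity types, which is the convention used by entity-level evaluators such as seqeval (that this card was scored with seqeval is an assumption; the card does not say). A minimal sketch that recovers the overall figures from the per-type values, rounded to four decimals, using only the standard library:

```python
# Per-type metrics from the evaluation results, rounded to four decimals.
per_type = {
    "Hard": {"precision": 0.6638, "recall": 0.7696, "number": 803},
    "Soft": {"precision": 0.6543, "recall": 0.7935, "number": 155},
}


def micro_average(metrics):
    """Pool true positives, predictions, and gold entities across types."""
    true_positives = 0
    predicted = 0
    gold = 0
    for m in metrics.values():
        # recall = TP / gold, so TP is recovered from recall and support.
        type_tp = round(m["recall"] * m["number"])
        true_positives += type_tp
        # precision = TP / predicted, so the prediction count follows from TP.
        predicted += round(type_tp / m["precision"])
        gold += m["number"]
    precision = true_positives / predicted
    recall = true_positives / gold
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


overall_precision, overall_recall, overall_f1 = micro_average(per_type)
print(round(overall_precision, 4), round(overall_recall, 4), round(overall_f1, 4))
```

Because Hard entities outnumber Soft entities by roughly five to one, the overall scores sit close to the Hard scores.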
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 32
+- eval_batch_size: 32
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 5
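To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at the end of each epoch. It assumes zero warmup steps (the card does not report a warmup value) and takes the total of 790 optimizer steps from the training results table; the function name `linear_lr` is illustrative and not part of any library.

```python
BASE_LR = 2e-05      # learning_rate from the hyperparameter list
TOTAL_STEPS = 790    # 158 steps per epoch * 5 epochs, per the training results table
WARMUP_STEPS = 0     # assumption: warmup is not reported in the card


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the end of each of the five epochs.
for epoch_end_step in (158, 316, 474, 632, 790):
    print(epoch_end_step, linear_lr(epoch_end_step))
```

Under these assumptions the rate falls by one fifth of its initial value per epoch and reaches zero at the final step.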
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Hard Precision | Hard Recall | Hard F1 | Soft Precision | Soft Recall | Soft F1 | Overall Precision | Overall Recall | Overall F1 | Overall Accuracy |
+|:-------------:|:-----:|:----:|:---------------:|:--------------:|:-----------:|:-------:|:--------------:|:-----------:|:-------:|:-----------------:|:--------------:|:----------:|:----------------:|
+| No log        | 1.0   | 158  | 0.1602          | 0.5013         | 0.7173      | 0.5902  | 0.4764         | 0.7161      | 0.5722  | 0.4971            | 0.7171         | 0.5872     | 0.9375           |
+| No log        | 2.0   | 316  | 0.1340          | 0.6008         | 0.7460      | 0.6656  | 0.6050         | 0.7806      | 0.6817  | 0.6015            | 0.7516         | 0.6682     | 0.9476           |
+| No log        | 3.0   | 474  | 0.1315          | 0.6578         | 0.7684      | 0.7088  | 0.6631         | 0.8000      | 0.7251  | 0.6587            | 0.7735         | 0.7115     | 0.9522           |
+| 0.1497        | 4.0   | 632  | 0.1456          | 0.6790         | 0.7771      | 0.7247  | 0.5971         | 0.7935      | 0.6814  | 0.6640            | 0.7797         | 0.7172     | 0.9525           |
+| 0.1497        | 5.0   | 790  | 0.1516          | 0.6638         | 0.7696      | 0.7128  | 0.6543         | 0.7935      | 0.7172  | 0.6622            | 0.7735         | 0.7135     | 0.9526           |
+Per-type metrics are rounded to four decimals. The evaluation set contains 803 Hard and 155 Soft entities at every epoch.
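The per-epoch results do not agree on a single best checkpoint: validation loss is lowest at epoch 3, overall F1 is highest at epoch 4, and the headline evaluation figures correspond to epoch 5. The sketch below picks the best epoch under each criterion from values copied out of the table; the `history` list and variable names are illustrative.

```python
# (epoch, validation_loss, overall_f1) copied from the training results table.
history = [
    (1, 0.1602, 0.5872),
    (2, 0.1340, 0.6682),
    (3, 0.1315, 0.7115),
    (4, 0.1456, 0.7172),
    (5, 0.1516, 0.7135),
]

# Lowest validation loss and highest overall F1 may point to different epochs.
best_by_loss = min(history, key=lambda row: row[1])
best_by_f1 = max(history, key=lambda row: row[2])

print("best epoch by validation loss:", best_by_loss[0])
print("best epoch by overall F1:", best_by_f1[0])
```

If the run were repeated, the Trainer's `load_best_model_at_end` and `metric_for_best_model` arguments could be used to keep the checkpoint that wins on the chosen criterion; whether they were set for this run is not stated in the card.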
+### Framework versions
+- Transformers 4.38.1
+- Pytorch 2.1.2+cu121
+- Datasets 2.18.0
+- Tokenizers 0.15.2