Commit 12ac50b (verified) by dathi103, 1 parent: c828833

End of training

Files changed (1)
  1. README.md +67 -0
README.md ADDED
@@ -0,0 +1,67 @@
+ ---
+ license: mit
+ base_model: deepset/gbert-base
+ tags:
+ - generated_from_trainer
+ model-index:
+ - name: gerskill-gbert
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # gerskill-gbert
+
+ This model is a fine-tuned version of [deepset/gbert-base](https://huggingface.co/deepset/gbert-base) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.1516
+ - Hard: {'precision': 0.6638023630504833, 'recall': 0.7696139476961394, 'f1': 0.7128027681660899, 'number': 803}
+ - Soft: {'precision': 0.6542553191489362, 'recall': 0.7935483870967742, 'f1': 0.7172011661807581, 'number': 155}
+ - Overall Precision: 0.6622
+ - Overall Recall: 0.7735
+ - Overall F1: 0.7135
+ - Overall Accuracy: 0.9526
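
The overall scores appear to be micro-averages over the two entity types. The sketch below recovers integer counts from the reported per-type precision, recall, and support, then recomputes the overall figures. It assumes that `number` is the gold-entity count per type and that rounding recovers the exact counts; the names `reported` and `micro_average` are illustrative and not part of the model card.

```python
# Per-type scores copied from the evaluation results above.
reported = {
    "Hard": {"precision": 0.6638023630504833, "recall": 0.7696139476961394, "number": 803},
    "Soft": {"precision": 0.6542553191489362, "recall": 0.7935483870967742, "number": 155},
}


def micro_average(per_type):
    """Recompute micro-averaged precision, recall and F1 from per-type scores.

    Assumption: `number` is the count of gold entities of that type, so
    recall * number is the true-positive count and TP / precision is the
    number of predicted entities.
    """
    true_pos = predicted = gold = 0
    for scores in per_type.values():
        tp = round(scores["recall"] * scores["number"])
        true_pos += tp
        predicted += round(tp / scores["precision"])
        gold += scores["number"]
    precision = true_pos / predicted
    recall = true_pos / gold
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


precision, recall, f1 = micro_average(reported)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

Under these assumptions the recomputed values agree with the reported 0.6622, 0.7735, and 0.7135 to four decimal places.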
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 32
+ - eval_batch_size: 32
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 5
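
As a rough illustration of what `lr_scheduler_type: linear` implies, the sketch below computes the learning rate after a given number of optimizer steps. It assumes no warmup (the card lists none) and 790 total steps, which is the final step count in the training results; `linear_lr` is a hypothetical helper written for this note, not a Transformers function.

```python
BASE_LR = 2e-05      # learning_rate from the card
TOTAL_STEPS = 790    # last step in the training results (5 epochs x 158 steps)
WARMUP_STEPS = 0     # assumption: the card does not list any warmup


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate after `step` optimizer updates under linear warmup and decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the rate at each epoch boundary.
for epoch in range(6):
    step = epoch * 158
    print(f"epoch {epoch}: step {step:3d}, lr {linear_lr(step):.2e}")
```

With these settings the rate starts at the configured 2e-05 and decays to zero at the last step.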
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Hard | Soft | Overall Precision | Overall Recall | Overall F1 | Overall Accuracy |
+ |:-------------:|:-----:|:----:|:---------------:|:--------------------------------------------------------------------------------------------------------:|:---------------------------------------------------------------------------------------------------------:|:-----------------:|:--------------:|:----------:|:----------------:|
+ | No log | 1.0 | 158 | 0.1602 | {'precision': 0.5013054830287206, 'recall': 0.7173100871731009, 'f1': 0.5901639344262294, 'number': 803} | {'precision': 0.47639484978540775, 'recall': 0.7161290322580646, 'f1': 0.5721649484536083, 'number': 155} | 0.4971 | 0.7171 | 0.5872 | 0.9375 |
+ | No log | 2.0 | 316 | 0.1340 | {'precision': 0.600802407221665, 'recall': 0.7459526774595268, 'f1': 0.6655555555555556, 'number': 803} | {'precision': 0.605, 'recall': 0.7806451612903226, 'f1': 0.6816901408450703, 'number': 155} | 0.6015 | 0.7516 | 0.6682 | 0.9476 |
+ | No log | 3.0 | 474 | 0.1315 | {'precision': 0.6577825159914712, 'recall': 0.7683686176836861, 'f1': 0.7087880528431935, 'number': 803} | {'precision': 0.6631016042780749, 'recall': 0.8, 'f1': 0.7251461988304094, 'number': 155} | 0.6587 | 0.7735 | 0.7115 | 0.9522 |
+ | 0.1497 | 4.0 | 632 | 0.1456 | {'precision': 0.6789989118607181, 'recall': 0.7770859277708593, 'f1': 0.7247386759581882, 'number': 803} | {'precision': 0.5970873786407767, 'recall': 0.7935483870967742, 'f1': 0.6814404432132964, 'number': 155} | 0.664 | 0.7797 | 0.7172 | 0.9525 |
+ | 0.1497 | 5.0 | 790 | 0.1516 | {'precision': 0.6638023630504833, 'recall': 0.7696139476961394, 'f1': 0.7128027681660899, 'number': 803} | {'precision': 0.6542553191489362, 'recall': 0.7935483870967742, 'f1': 0.7172011661807581, 'number': 155} | 0.6622 | 0.7735 | 0.7135 | 0.9526 |
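
The table suggests that the preferred checkpoint depends on the selection metric: validation loss is lowest at epoch 3, while overall F1 peaks at epoch 4. A minimal sketch of that comparison follows, with the rows transcribed from the table above. The tuple layout is illustrative, and the card does not state which epoch the published weights come from.

```python
# (epoch, validation_loss, overall_f1) transcribed from the training results table.
history = [
    (1, 0.1602, 0.5872),
    (2, 0.1340, 0.6682),
    (3, 0.1315, 0.7115),
    (4, 0.1456, 0.7172),
    (5, 0.1516, 0.7135),
]

# Lower is better for loss, higher is better for F1.
best_by_loss = min(history, key=lambda row: row[1])
best_by_f1 = max(history, key=lambda row: row[2])

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]:.4f})")
print(f"highest overall F1:     epoch {best_by_f1[0]} ({best_by_f1[2]:.4f})")
```

In the Trainer API, `load_best_model_at_end` together with `metric_for_best_model` is the usual way to keep such a checkpoint; whether either option was set for this run is not recorded in the card.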
+
+
+ ### Framework versions
+
+ - Transformers 4.38.1
+ - Pytorch 2.1.2+cu121
+ - Datasets 2.18.0
+ - Tokenizers 0.15.2