---
license: apache-2.0
base_model: google/mt5-large
tags:
- generated_from_keras_callback
model-index:
- name: pakawadeep/mt5-large-finetuned-ctfl-augmented_05
results: []
---
<!-- This model card has been generated automatically according to the information Keras had access to. You should
probably proofread and complete it, then remove this comment. -->
# pakawadeep/mt5-large-finetuned-ctfl-augmented_05
This model is a fine-tuned version of [google/mt5-large](https://huggingface.co/google/mt5-large) on an unknown dataset.
After the final training epoch (epoch 20), it reports the following training and validation metrics:
- Train Loss: 0.2567
- Validation Loss: 0.6818
- Train Rouge1: 8.9109
- Train Rouge2: 1.3861
- Train Rougel: 8.9463
- Train Rougelsum: 8.9463
- Train Gen Len: 11.9109
- Epoch: 20
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32
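
To make the optimizer settings above concrete, the following is a minimal pure-Python sketch of one textbook AdamW-style update on a single scalar weight, using the listed values. It is an illustration under assumptions, not the library's actual `AdamWeightDecay` code: the function name `adamw_step` and the example weight and gradient are hypothetical, and real training applies the update to whole tensors.

```python
import math

# Hyperparameters copied from the list above.
LEARNING_RATE = 2e-05
BETA_1 = 0.9
BETA_2 = 0.999
EPSILON = 1e-07
WEIGHT_DECAY_RATE = 0.01


def adamw_step(weight, grad, m, v, step):
    """Apply one textbook AdamW update to a scalar weight (hypothetical helper).

    `m` and `v` are the running first and second moment estimates,
    and `step` counts from 1. Returns the updated (weight, m, v).
    """
    # Decoupled weight decay: shrink the weight directly, independent of the gradient.
    weight -= LEARNING_RATE * WEIGHT_DECAY_RATE * weight

    # Exponential moving averages of the gradient and the squared gradient.
    m = BETA_1 * m + (1.0 - BETA_1) * grad
    v = BETA_2 * v + (1.0 - BETA_2) * grad * grad

    # Bias correction compensates for m and v starting at zero.
    m_hat = m / (1.0 - BETA_1 ** step)
    v_hat = v / (1.0 - BETA_2 ** step)

    # 'amsgrad': False means v_hat is used as-is, with no running maximum.
    weight -= LEARNING_RATE * m_hat / (math.sqrt(v_hat) + EPSILON)
    return weight, m, v


# Example values (assumed for illustration only).
new_weight, m, v = adamw_step(weight=1.0, grad=0.5, m=0.0, v=0.0, step=1)
print(f"weight after one step: {new_weight:.10f}")
```

With a learning rate of 2e-05, each step moves a weight by roughly that amount at most, which is consistent with the slow, steady decline in train loss over the 21 epochs reported below.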
### Training results
| Train Loss | Validation Loss | Train Rouge1 | Train Rouge2 | Train Rougel | Train Rougelsum | Train Gen Len | Epoch |
|:----------:|:---------------:|:------------:|:------------:|:------------:|:---------------:|:-------------:|:-----:|
| 5.3441 | 2.0990 | 3.1931 | 0.4400 | 3.2577 | 3.2151 | 12.2277 | 0 |
| 2.2977 | 1.5680 | 7.0014 | 1.0891 | 7.0651 | 6.9307 | 11.3267 | 1 |
| 1.7363 | 1.2611 | 7.0674 | 1.0891 | 7.1287 | 7.0745 | 11.5545 | 2 |
| 1.4302 | 1.0860 | 8.3805 | 2.4257 | 8.4158 | 8.4158 | 11.8069 | 3 |
| 1.2082 | 0.9516 | 8.3805 | 2.4257 | 8.4158 | 8.4158 | 11.8861 | 4 |
| 1.0516 | 0.8511 | 8.3805 | 2.4257 | 8.4158 | 8.4158 | 12.0149 | 5 |
| 0.9244 | 0.7961 | 8.9109 | 2.4257 | 8.9109 | 8.9109 | 11.9950 | 6 |
| 0.8280 | 0.7524 | 8.9109 | 2.3762 | 8.8755 | 8.9109 | 11.9802 | 7 |
| 0.7521 | 0.7230 | 8.9109 | 2.3762 | 8.8755 | 8.9109 | 11.9406 | 8 |
| 0.6888 | 0.6988 | 8.9109 | 2.3762 | 8.8755 | 8.9109 | 11.9307 | 9 |
| 0.6330 | 0.6676 | 8.6634 | 1.7822 | 8.6810 | 8.6103 | 11.9109 | 10 |
| 0.5835 | 0.6465 | 7.7793 | 1.2871 | 7.9208 | 7.9208 | 11.9010 | 11 |
| 0.5299 | 0.6289 | 8.4158 | 1.2871 | 8.4158 | 8.4335 | 11.9356 | 12 |
| 0.4876 | 0.6310 | 8.4158 | 1.2871 | 8.4158 | 8.4335 | 11.8911 | 13 |
| 0.4402 | 0.6207 | 8.4158 | 1.2871 | 8.4158 | 8.4335 | 11.9109 | 14 |
| 0.4068 | 0.6237 | 8.4158 | 1.2871 | 8.4158 | 8.4335 | 11.9158 | 15 |
| 0.3686 | 0.6314 | 8.4158 | 1.2871 | 8.4158 | 8.4335 | 11.9356 | 16 |
| 0.3359 | 0.6296 | 8.9109 | 1.2871 | 8.9109 | 8.9463 | 11.8960 | 17 |
| 0.3090 | 0.6569 | 8.9109 | 1.3861 | 8.9463 | 8.9463 | 11.8960 | 18 |
| 0.2774 | 0.6649 | 8.9109 | 1.3861 | 8.9463 | 8.9463 | 11.8762 | 19 |
| 0.2567 | 0.6818 | 8.9109 | 1.3861 | 8.9463 | 8.9463 | 11.9109 | 20 |
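
As a way of reading the table above, the sketch below picks out the epoch with the lowest validation loss and compares it with the final epoch. The loss values are copied from the table; the names `HISTORY` and `best_epoch` are hypothetical, and the interpretation that later epochs may be overfitting is an inference from the numbers, not a claim made by the model card.

```python
# (epoch, train_loss, validation_loss) copied from the training results table.
HISTORY = [
    (0, 5.3441, 2.0990), (1, 2.2977, 1.5680), (2, 1.7363, 1.2611),
    (3, 1.4302, 1.0860), (4, 1.2082, 0.9516), (5, 1.0516, 0.8511),
    (6, 0.9244, 0.7961), (7, 0.8280, 0.7524), (8, 0.7521, 0.7230),
    (9, 0.6888, 0.6988), (10, 0.6330, 0.6676), (11, 0.5835, 0.6465),
    (12, 0.5299, 0.6289), (13, 0.4876, 0.6310), (14, 0.4402, 0.6207),
    (15, 0.4068, 0.6237), (16, 0.3686, 0.6314), (17, 0.3359, 0.6296),
    (18, 0.3090, 0.6569), (19, 0.2774, 0.6649), (20, 0.2567, 0.6818),
]


def best_epoch(history):
    """Return the (epoch, train_loss, val_loss) row with the lowest validation loss."""
    return min(history, key=lambda row: row[2])


epoch, train_loss, val_loss = best_epoch(HISTORY)
final_epoch, final_train_loss, final_val_loss = HISTORY[-1]

# How much validation loss rose between the best epoch and the last one.
val_gap = final_val_loss - val_loss

print(f"lowest validation loss: {val_loss:.4f} at epoch {epoch}")
print(f"final validation loss:  {final_val_loss:.4f} at epoch {final_epoch}")
print(f"increase since best epoch: {val_gap:.4f}")
```

Train loss keeps falling after the best validation epoch while validation loss drifts upward, so a checkpoint from around that epoch may generalize better than the final one, if such a checkpoint is available.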
### Framework versions
- Transformers 4.41.2
- TensorFlow 2.15.0
- Datasets 2.20.0
- Tokenizers 0.19.1