End of training
README.md
CHANGED
@@ -21,7 +21,7 @@ model-index:
     metrics:
     - name: Rouge1
       type: rouge
-      value: 0.
+      value: 0.3869876274946419
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -31,11 +31,11 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [allenai/led-base-16384](https://huggingface.co/allenai/led-base-16384) on the cnn_dailymail dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.
-- Rouge1: 0.
-- Rouge2: 0.
-- Rougel: 0.
-- Rougelsum: 0.
+- Loss: 1.5544
+- Rouge1: 0.3870
+- Rouge2: 0.1736
+- Rougel: 0.2599
+- Rougelsum: 0.3653
 
 ## Model description
 
@@ -62,7 +62,8 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 128
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs:
+- num_epochs: 10
+- mixed_precision_training: Native AMP
 
 ### Training results
 
@@ -94,11 +95,29 @@ The following hyperparameters were used during training:
 | 1.6554 | 5.35 | 12000 | 1.6044 | 0.3817 | 0.1695 | 0.2559 | 0.3605 |
 | 1.6155 | 5.57 | 12500 | 1.6010 | 0.3825 | 0.1700 | 0.2561 | 0.3608 |
 | 1.5863 | 5.8  | 13000 | 1.5981 | 0.3829 | 0.1704 | 0.2569 | 0.3614 |
+| 1.6306 | 6.02 | 13500 | 1.6004 | 0.3831 | 0.1702 | 0.2563 | 0.3618 |
+| 1.6425 | 6.24 | 14000 | 1.5987 | 0.3821 | 0.1698 | 0.2561 | 0.3610 |
+| 1.6863 | 6.46 | 14500 | 1.5876 | 0.3837 | 0.1710 | 0.2569 | 0.3622 |
+| 1.6085 | 6.69 | 15000 | 1.5815 | 0.3836 | 0.1717 | 0.2573 | 0.3621 |
+| 1.6267 | 6.91 | 15500 | 1.5792 | 0.3852 | 0.1722 | 0.2579 | 0.3633 |
+| 1.5637 | 7.13 | 16000 | 1.5768 | 0.3830 | 0.1709 | 0.2568 | 0.3611 |
+| 1.5586 | 7.36 | 16500 | 1.5740 | 0.3833 | 0.1706 | 0.2567 | 0.3617 |
+| 1.5389 | 7.58 | 17000 | 1.5689 | 0.3858 | 0.1729 | 0.2590 | 0.3640 |
+| 1.5694 | 7.8  | 17500 | 1.5645 | 0.3853 | 0.1731 | 0.2589 | 0.3636 |
+| 1.5265 | 8.02 | 18000 | 1.5621 | 0.3871 | 0.1733 | 0.2596 | 0.3654 |
+| 1.5273 | 8.25 | 18500 | 1.5624 | 0.3861 | 0.1726 | 0.2588 | 0.3646 |
+| 1.5148 | 8.47 | 19000 | 1.5602 | 0.3866 | 0.1733 | 0.2592 | 0.3651 |
+| 1.532  | 8.69 | 19500 | 1.5599 | 0.3859 | 0.1732 | 0.2593 | 0.3642 |
+| 1.5113 | 8.92 | 20000 | 1.5602 | 0.3877 | 0.1748 | 0.2606 | 0.3658 |
+| 1.5133 | 9.14 | 20500 | 1.5595 | 0.3855 | 0.1725 | 0.2587 | 0.3637 |
+| 1.4875 | 9.36 | 21000 | 1.5572 | 0.3873 | 0.1741 | 0.2600 | 0.3654 |
+| 1.5038 | 9.59 | 21500 | 1.5557 | 0.3860 | 0.1728 | 0.2590 | 0.3641 |
+| 1.5062 | 9.81 | 22000 | 1.5544 | 0.3870 | 0.1736 | 0.2599 | 0.3653 |
 
 
 ### Framework versions
 
-- Transformers 4.
-- Pytorch
-- Datasets 2.
-- Tokenizers 0.13.
+- Transformers 4.27.1
+- Pytorch 2.0.0+cu118
+- Datasets 2.10.1
+- Tokenizers 0.13.2
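The Rouge1/Rouge2/Rougel/Rougelsum columns above are reported as fractions in [0, 1]. As a rough illustration of what ROUGE-1 measures, here is a minimal unigram-overlap F1 sketch; the actual scores in the table come from the standard `rouge_score` implementation, which additionally applies its own tokenization and stemming:

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Naive ROUGE-1 F1: unigram overlap between prediction and reference.
    (The real rouge_score package also tokenizes and stems before matching.)"""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    # Clipped overlap: each reference unigram can be matched at most as
    # many times as it occurs in the reference.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

A perfect match scores 1.0; a short but fully-overlapping prediction trades recall for precision, which is why F1 is reported.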
generation_config.json
CHANGED
@@ -8,5 +8,5 @@
   "min_length": 100,
   "no_repeat_ngram_size": 3,
   "pad_token_id": 1,
-  "transformers_version": "4.
+  "transformers_version": "4.27.1"
 }
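Besides the version bump, the config keeps `no_repeat_ngram_size: 3`, which blocks any token that would repeat a trigram already present in the generated sequence. A minimal sketch of that banning rule (an illustration of the idea, not the transformers implementation):

```python
def banned_next_tokens(generated: list[int], n: int = 3) -> set[int]:
    """Tokens that would complete an n-gram already seen in `generated`,
    i.e. the candidates that no_repeat_ngram_size=n would block next."""
    if len(generated) < n - 1:
        return set()
    prefix = tuple(generated[-(n - 1):])  # last n-1 generated tokens
    banned = set()
    # Scan every historical (n-1)-gram; if it equals the current prefix,
    # the token that followed it is banned at this step.
    for i in range(len(generated) - n + 1):
        if tuple(generated[i:i + n - 1]) == prefix:
            banned.add(generated[i + n - 1])
    return banned
```

For example, after generating `[1, 2, 3, 1, 2]` with `n=3`, emitting `3` would repeat the trigram `(1, 2, 3)`, so `3` is banned.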
pytorch_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:0848cdfaffaf059ca087c492f3460b77b5b8bd36098390fbd024df55a0cba4a2
 size 647680813
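`pytorch_model.bin` is stored via Git LFS, so the diff only touches the pointer file (the real weights live in LFS storage); the `oid` line changes while the `size` stays identical. A small sketch of reading such a pointer into its key/value fields:

```python
def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Parse a Git LFS pointer file: one 'key value' pair per line."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The new pointer content from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:0848cdfaffaf059ca087c492f3460b77b5b8bd36098390fbd024df55a0cba4a2
size 647680813"""
```

Since the `size` field is unchanged, the checkpoint layout is the same and only the tensor values were updated by the extra training epochs.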
runs/Aug25_13-30-05_pop-os/events.out.tfevents.1692941410.pop-os.8422.0
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:c42039ceca3e30852370ede3141d682ab1f754cf6d78f84e4451c3ecbd29cd78
+size 81635