Commit 3625b87
1 Parent(s): a21b010
update model card README.md
README.md
ADDED
@@ -0,0 +1,137 @@
---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- arxiv-summarization
metrics:
- rouge
model-index:
- name: arxiv-summarization-t5-base-2022-09-21
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: arxiv-summarization
      type: arxiv-summarization
      config: section
      split: train
      args: section
    metrics:
    - name: Rouge1
      type: rouge
      value: 19.2884
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# arxiv-summarization-t5-base-2022-09-21

This model is a fine-tuned version of [t5-base](https://huggingface.co/t5-base) on the arxiv-summarization dataset.
It achieves the following results on the evaluation set:
- Loss: 1.8655
- Rouge1: 19.2884
- Rouge2: 7.8087
- Rougel: 15.4025
- Rougelsum: 17.5856
- Gen Len: 19.0
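
For orientation, ROUGE-1 measures unigram overlap between a generated summary and its reference. The sketch below is a simplified, whitespace-tokenized F1 on a 0 to 1 scale. It is not the implementation that produced the numbers above, which presumably came from the `rouge` metric named in the card metadata and appear to be scaled by 100. The function name `rouge1_f1` and the example sentences are hypothetical.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap on lowercased whitespace tokens."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Counter intersection keeps the minimum count per token (clipped overlap).
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative strings only; they are not taken from the dataset.
    reference = "we propose a new method for summarizing long documents"
    candidate = "a new method for summarizing documents"
    print(round(rouge1_f1(reference, candidate), 4))
```

A production metric typically adds stemming and a more careful tokenizer, so scores from this sketch will not match the reported values exactly.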

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
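
To illustrate what `lr_scheduler_type: linear` implies for the learning rate, here is a minimal sketch of a linear schedule with optional warmup. It assumes zero warmup steps, since the card lists none, and takes the total number of optimizer steps as a parameter because the card does not state it. The name `linear_lr` and the `total_steps` value in the demo are hypothetical.

```python
def linear_lr(step: int, total_steps: int,
              base_lr: float = 5e-05, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly so the rate reaches 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Assumption: 600_000 is an illustrative total, not the run's real value.
    total = 600_000
    for step in (0, 150_000, 300_000, 600_000):
        print(step, linear_lr(step, total))
```

Under these assumptions the rate starts at the configured 5e-05 and falls steadily, so later checkpoints are trained with much smaller updates than early ones.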

### Training results

| Training Loss | Epoch | Step   | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:------:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
| 2.3291        | 0.05  | 10000  | 2.1906          | 18.6571 | 7.1341 | 14.8347 | 16.9545   | 19.0    |
| 2.2454        | 0.1   | 20000  | 2.1549          | 18.5037 | 7.1908 | 14.7141 | 16.8233   | 18.9997 |
| 2.2107        | 0.15  | 30000  | 2.1013          | 18.7638 | 7.326  | 14.9437 | 17.072    | 19.0    |
| 2.1486        | 0.2   | 40000  | 2.0845          | 18.6879 | 7.2441 | 14.8835 | 16.983    | 19.0    |
| 2.158         | 0.25  | 50000  | 2.0699          | 18.8314 | 7.3712 | 15.0166 | 17.1215   | 19.0    |
| 2.1476        | 0.3   | 60000  | 2.0424          | 18.9783 | 7.4138 | 15.1121 | 17.2778   | 18.9981 |
| 2.1164        | 0.34  | 70000  | 2.0349          | 18.9257 | 7.4649 | 15.0335 | 17.1819   | 19.0    |
| 2.079         | 0.39  | 80000  | 2.0208          | 18.643  | 7.4096 | 14.8927 | 16.9786   | 18.9994 |
| 2.101         | 0.44  | 90000  | 2.0113          | 19.3881 | 7.7012 | 15.3981 | 17.6516   | 19.0    |
| 2.0576        | 0.49  | 100000 | 2.0022          | 18.9985 | 7.542  | 15.1157 | 17.2972   | 18.9992 |
| 2.0983        | 0.54  | 110000 | 1.9941          | 18.7691 | 7.4625 | 15.0256 | 17.1146   | 19.0    |
| 2.053         | 0.59  | 120000 | 1.9855          | 19.002  | 7.5602 | 15.1497 | 17.2963   | 19.0    |
| 2.0434        | 0.64  | 130000 | 1.9786          | 19.2385 | 7.6533 | 15.3094 | 17.5439   | 18.9994 |
| 2.0354        | 0.69  | 140000 | 1.9746          | 19.184  | 7.7307 | 15.2897 | 17.491    | 18.9992 |
| 2.0347        | 0.74  | 150000 | 1.9639          | 19.2408 | 7.693  | 15.3357 | 17.5297   | 19.0    |
| 2.0236        | 0.79  | 160000 | 1.9590          | 19.0781 | 7.6256 | 15.1932 | 17.3486   | 18.9998 |
| 2.0187        | 0.84  | 170000 | 1.9532          | 19.0343 | 7.6792 | 15.1884 | 17.3519   | 19.0    |
| 1.9939        | 0.89  | 180000 | 1.9485          | 18.8247 | 7.5005 | 15.0246 | 17.1485   | 18.9998 |
| 1.9961        | 0.94  | 190000 | 1.9504          | 19.0695 | 7.6559 | 15.2139 | 17.3814   | 19.0    |
| 2.0197        | 0.99  | 200000 | 1.9399          | 19.2821 | 7.6685 | 15.3029 | 17.5374   | 18.9988 |
| 1.9457        | 1.03  | 210000 | 1.9350          | 19.053  | 7.6502 | 15.2123 | 17.3793   | 19.0    |
| 1.9552        | 1.08  | 220000 | 1.9317          | 19.1878 | 7.7235 | 15.3272 | 17.5252   | 18.9998 |
| 1.9772        | 1.13  | 230000 | 1.9305          | 19.0855 | 7.6303 | 15.1943 | 17.3942   | 18.9997 |
| 1.9171        | 1.18  | 240000 | 1.9291          | 19.0711 | 7.6437 | 15.2175 | 17.3893   | 18.9995 |
| 1.9393        | 1.23  | 250000 | 1.9230          | 19.276  | 7.725  | 15.3826 | 17.586    | 18.9995 |
| 1.9295        | 1.28  | 260000 | 1.9197          | 19.2999 | 7.7958 | 15.3961 | 17.6056   | 18.9975 |
| 1.9725        | 1.33  | 270000 | 1.9173          | 19.2958 | 7.7121 | 15.3659 | 17.584    | 19.0    |
| 1.9668        | 1.38  | 280000 | 1.9129          | 19.089  | 7.6846 | 15.2395 | 17.3879   | 18.9998 |
| 1.941         | 1.43  | 290000 | 1.9132          | 19.2127 | 7.7336 | 15.311  | 17.4742   | 18.9995 |
| 1.9427        | 1.48  | 300000 | 1.9108          | 19.217  | 7.7591 | 15.334  | 17.53     | 18.9998 |
| 1.9521        | 1.53  | 310000 | 1.9041          | 19.1285 | 7.6736 | 15.2625 | 17.458    | 19.0    |
| 1.9352        | 1.58  | 320000 | 1.9041          | 19.1656 | 7.723  | 15.3035 | 17.4818   | 18.9991 |
| 1.9342        | 1.63  | 330000 | 1.9004          | 19.2573 | 7.7766 | 15.3558 | 17.5382   | 19.0    |
| 1.9631        | 1.68  | 340000 | 1.8978          | 19.236  | 7.7584 | 15.3408 | 17.4993   | 18.9998 |
| 1.8987        | 1.72  | 350000 | 1.8968          | 19.1716 | 7.7231 | 15.2836 | 17.4655   | 18.9997 |
| 1.9433        | 1.77  | 360000 | 1.8924          | 19.2644 | 7.8294 | 15.4018 | 17.5808   | 18.9998 |
| 1.9159        | 1.82  | 370000 | 1.8912          | 19.1833 | 7.8267 | 15.3175 | 17.4918   | 18.9995 |
| 1.9516        | 1.87  | 380000 | 1.8856          | 19.3077 | 7.7432 | 15.3723 | 17.6115   | 19.0    |
| 1.9218        | 1.92  | 390000 | 1.8880          | 19.2668 | 7.8231 | 15.3834 | 17.5701   | 18.9994 |
| 1.9159        | 1.97  | 400000 | 1.8860          | 19.2224 | 7.7903 | 15.3488 | 17.4992   | 18.9997 |
| 1.8741        | 2.02  | 410000 | 1.8854          | 19.2572 | 7.741  | 15.3405 | 17.5351   | 19.0    |
| 1.8668        | 2.07  | 420000 | 1.8854          | 19.3658 | 7.8593 | 15.4418 | 17.656    | 18.9995 |
| 1.8638        | 2.12  | 430000 | 1.8831          | 19.305  | 7.8218 | 15.3843 | 17.5861   | 18.9997 |
| 1.8334        | 2.17  | 440000 | 1.8817          | 19.3269 | 7.8249 | 15.4231 | 17.5958   | 18.9994 |
| 1.8893        | 2.22  | 450000 | 1.8803          | 19.2949 | 7.7885 | 15.3947 | 17.585    | 18.9997 |
| 1.8929        | 2.27  | 460000 | 1.8783          | 19.291  | 7.8346 | 15.428  | 17.5797   | 18.9997 |
| 1.861         | 2.32  | 470000 | 1.8766          | 19.4284 | 7.8832 | 15.4746 | 17.6946   | 18.9997 |
| 1.8719        | 2.37  | 480000 | 1.8751          | 19.1525 | 7.7641 | 15.3348 | 17.47     | 18.9998 |
| 1.8889        | 2.41  | 490000 | 1.8742          | 19.1743 | 7.768  | 15.3292 | 17.4665   | 18.9998 |
| 1.8834        | 2.46  | 500000 | 1.8723          | 19.3069 | 7.7935 | 15.3987 | 17.5913   | 18.9998 |
| 1.8564        | 2.51  | 510000 | 1.8695          | 19.3217 | 7.8292 | 15.4063 | 17.6081   | 19.0    |
| 1.8706        | 2.56  | 520000 | 1.8697          | 19.294  | 7.8217 | 15.3964 | 17.581    | 18.9998 |
| 1.883         | 2.61  | 530000 | 1.8703          | 19.2784 | 7.8634 | 15.404  | 17.5942   | 18.9995 |
| 1.8622        | 2.66  | 540000 | 1.8677          | 19.3165 | 7.8378 | 15.4259 | 17.6064   | 18.9988 |
| 1.8781        | 2.71  | 550000 | 1.8676          | 19.3237 | 7.7954 | 15.3995 | 17.6008   | 19.0    |
| 1.8793        | 2.76  | 560000 | 1.8685          | 19.2141 | 7.7605 | 15.3345 | 17.5268   | 18.9997 |
| 1.8795        | 2.81  | 570000 | 1.8675          | 19.2694 | 7.8082 | 15.3996 | 17.5831   | 19.0    |
| 1.8425        | 2.86  | 580000 | 1.8659          | 19.2886 | 7.7987 | 15.4005 | 17.5859   | 18.9997 |
| 1.8605        | 2.91  | 590000 | 1.8650          | 19.2778 | 7.7934 | 15.3931 | 17.5809   | 18.9997 |
| 1.8448        | 2.96  | 600000 | 1.8655          | 19.2884 | 7.8087 | 15.4025 | 17.5856   | 19.0    |
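
The final row is not the best on every metric: the lowest validation loss in the table is at step 590000 and the highest Rouge1 is at step 470000. Below is a small sketch of how such rows could be scanned, using three rows copied from the table above. The helper names are hypothetical and this is not part of the original training code.

```python
def parse_row(line: str) -> dict:
    """Parse one markdown row of the results table into named numbers."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    keys = ["train_loss", "epoch", "step", "val_loss",
            "rouge1", "rouge2", "rougeL", "rougeLsum", "gen_len"]
    row = dict(zip(keys, map(float, cells)))
    row["step"] = int(row["step"])  # steps are whole numbers
    return row


# Three rows copied verbatim from the results table.
ROWS = [
    "| 1.861 | 2.32 | 470000 | 1.8766 | 19.4284 | 7.8832 | 15.4746 | 17.6946 | 18.9997 |",
    "| 1.8605 | 2.91 | 590000 | 1.8650 | 19.2778 | 7.7934 | 15.3931 | 17.5809 | 18.9997 |",
    "| 1.8448 | 2.96 | 600000 | 1.8655 | 19.2884 | 7.8087 | 15.4025 | 17.5856 | 19.0 |",
]

rows = [parse_row(r) for r in ROWS]
best_loss = min(rows, key=lambda r: r["val_loss"])   # lower is better
best_rouge1 = max(rows, key=lambda r: r["rouge1"])   # higher is better

if __name__ == "__main__":
    print("lowest validation loss at step", best_loss["step"])
    print("highest Rouge1 at step", best_rouge1["step"])
```

The differences between these late checkpoints are small (about 0.0005 in loss and 0.14 in Rouge1), so which checkpoint to prefer depends on the metric that matters for the intended use.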


### Framework versions

- Transformers 4.23.0.dev0
- PyTorch 1.12.0
- Datasets 2.5.1
- Tokenizers 0.13.0