---
tags:
- generated_from_trainer
datasets:
- xsum
metrics:
- rouge
model-index:
- name: XSum_t5-small_800_adafactor
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: xsum
      type: xsum
      args: default
    metrics:
    - name: Rouge1
      type: rouge
      value: 33.022
---

# XSum_t5-small_800_adafactor

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on the xsum dataset, resumed from the local checkpoint `/content/XSum_t5-small_800_adafactor/checkpoint-11000`.
It achieves the following results on the evaluation set:
- Loss: 2.1714
- Rouge1: 33.022
- Rouge2: 11.9979
- RougeL: 26.7476
- RougeLsum: 26.7402
- Gen Len: 18.7543
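
A minimal sketch of how ROUGE scores in this style can be computed with the `evaluate` library; the exact evaluation loop behind the numbers above is not part of this card, and the prediction/reference pair below is a placeholder:

```python
import evaluate

rouge = evaluate.load("rouge")

# Placeholder pair; the card's numbers come from the xsum evaluation set.
scores = rouge.compute(
    predictions=["a man has been arrested after a car was driven into a shop"],
    references=["police arrested a man after a car crashed into a shop"],
    use_stemmer=True,
)

# Recent versions of evaluate return plain fractions in [0, 1];
# Trainer-generated cards report these values scaled by 100.
print({name: round(value * 100, 4) for name, value in scores.items()})
```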

## Model description

A [t5-small](https://huggingface.co/t5-small) encoder-decoder model fine-tuned for extreme (single-sentence) abstractive summarization on XSum. Given a news article, it generates a one-sentence summary in the style of the BBC-written summaries in the dataset.

## Intended uses & limitations

Intended for abstractive single-sentence summarization of English news-style text. Like other abstractive summarizers, it can produce summaries containing factual errors or details not supported by the source article, so outputs should be checked before downstream use.
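
A minimal inference sketch, assuming the checkpoint is published on the Hub; `your-namespace/XSum_t5-small_800_adafactor` is a placeholder repo id:

```python
from transformers import pipeline

# Placeholder repo id; replace with the actual Hub namespace for this model.
summarizer = pipeline(
    "summarization",
    model="your-namespace/XSum_t5-small_800_adafactor",
)

article = """The full text of a BBC-style news article goes here. XSum models
are trained to compress an entire article into a single sentence."""

# Generation lengths are in tokens; the eval "Gen Len" above averages ~18.8,
# so a max_length around 32 leaves headroom for a one-sentence summary.
print(summarizer(article, max_length=32, min_length=8, do_sample=False))
```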

## Training and evaluation data

Fine-tuned and evaluated on [xsum](https://huggingface.co/datasets/xsum): BBC articles paired with professionally written single-sentence summaries, split into roughly 204k train, 11.3k validation, and 11.3k test examples.
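
A minimal loading sketch with the `datasets` library; the split sizes quoted above come from the public xsum dataset card:

```python
from datasets import load_dataset

# xsum examples have three fields: "document", "summary", and "id".
dataset = load_dataset("xsum")
print(dataset)  # ~204k train / ~11.3k validation / ~11.3k test

sample = dataset["train"][0]
print(sample["document"][:200])  # full BBC article
print(sample["summary"])         # one-sentence reference summary
```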

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 25
- eval_batch_size: 25
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2
- mixed_precision_training: Native AMP
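
A `Seq2SeqTrainingArguments` sketch matching the values above; `output_dir` and the 100-step eval cadence are inferred from the model name and the results table rather than confirmed by this card:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="XSum_t5-small_800_adafactor",  # inferred from the model name
    learning_rate=1e-4,
    per_device_train_batch_size=25,
    per_device_eval_batch_size=25,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=2,
    fp16=True,                      # "Native AMP" mixed precision
    evaluation_strategy="steps",    # the results table logs every 100 steps
    eval_steps=100,
    predict_with_generate=True,     # needed to compute ROUGE during eval
)
```

Note that despite the `adafactor` suffix in the model name, the reported optimizer is Adam with the default betas and epsilon, i.e. the `Trainer` default, so the sketch leaves the optimizer setting untouched.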

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.3404        | 0.01  | 100  | 2.2058          | 32.4826 | 11.5807 | 26.2716 | 26.2611   | 18.7842 |
| 2.3194        | 0.02  | 200  | 2.2028          | 32.6393 | 11.661  | 26.372  | 26.3643   | 18.788  |
| 2.3247        | 0.04  | 300  | 2.1999          | 32.6792 | 11.6985 | 26.3876 | 26.3786   | 18.7354 |
| 2.3276        | 0.05  | 400  | 2.1979          | 32.6668 | 11.7272 | 26.3964 | 26.3907   | 18.7957 |
| 2.317         | 0.06  | 500  | 2.1957          | 32.8267 | 11.8165 | 26.5075 | 26.4997   | 18.7543 |
| 2.3214        | 0.07  | 600  | 2.1942          | 32.8319 | 11.8064 | 26.5428 | 26.5448   | 18.7693 |
| 2.3014        | 0.09  | 700  | 2.1931          | 32.7136 | 11.7334 | 26.4958 | 26.486    | 18.7759 |
| 2.3294        | 0.1   | 800  | 2.1902          | 32.6818 | 11.7684 | 26.4314 | 26.4242   | 18.785  |
| 2.299         | 0.11  | 900  | 2.1914          | 32.672  | 11.7606 | 26.4475 | 26.4367   | 18.7853 |
| 2.3009        | 0.12  | 1000 | 2.1900          | 32.7816 | 11.7958 | 26.5167 | 26.5099   | 18.7685 |
| 2.2913        | 0.13  | 1100 | 2.1885          | 32.6438 | 11.7398 | 26.4077 | 26.4051   | 18.7742 |
| 2.293         | 0.15  | 1200 | 2.1854          | 32.8228 | 11.841  | 26.548  | 26.5415   | 18.7899 |
| 2.2857        | 0.16  | 1300 | 2.1853          | 32.7118 | 11.7439 | 26.4989 | 26.4941   | 18.7998 |
| 2.2921        | 0.17  | 1400 | 2.1832          | 32.6705 | 11.7333 | 26.4076 | 26.4082   | 18.8017 |
| 2.3074        | 0.18  | 1500 | 2.1827          | 32.7543 | 11.7787 | 26.4904 | 26.4923   | 18.7827 |
| 2.3044        | 0.2   | 1600 | 2.1806          | 32.8573 | 11.8672 | 26.5655 | 26.5619   | 18.8097 |
| 2.2922        | 0.21  | 1700 | 2.1819          | 32.8394 | 11.8158 | 26.5523 | 26.5467   | 18.7891 |
| 2.2901        | 0.22  | 1800 | 2.1803          | 32.7219 | 11.7493 | 26.4644 | 26.4572   | 18.7882 |
| 2.286         | 0.23  | 1900 | 2.1790          | 32.7474 | 11.852  | 26.5078 | 26.5014   | 18.7699 |
| 2.298         | 0.25  | 2000 | 2.1781          | 32.8662 | 11.8878 | 26.618  | 26.6174   | 18.7979 |
| 2.2787        | 0.26  | 2100 | 2.1775          | 32.9621 | 11.9521 | 26.6955 | 26.6914   | 18.7934 |
| 2.2823        | 0.27  | 2200 | 2.1777          | 33.0633 | 12.0622 | 26.7715 | 26.7597   | 18.7954 |
| 2.2889        | 0.28  | 2300 | 2.1742          | 32.9637 | 12.0154 | 26.6771 | 26.6721   | 18.7844 |
| 2.2847        | 0.29  | 2400 | 2.1774          | 32.7435 | 11.8869 | 26.5334 | 26.5306   | 18.756  |
| 2.2923        | 0.31  | 2500 | 2.1754          | 32.8437 | 11.8977 | 26.59   | 26.587    | 18.7964 |
| 2.2877        | 0.32  | 2600 | 2.1740          | 32.9137 | 11.9267 | 26.618  | 26.6046   | 18.7678 |
| 2.2976        | 0.33  | 2700 | 2.1728          | 32.9372 | 11.9048 | 26.6412 | 26.6345   | 18.7838 |
| 2.2935        | 0.34  | 2800 | 2.1719          | 32.7338 | 11.7836 | 26.5667 | 26.5629   | 18.7659 |
| 2.2622        | 0.36  | 2900 | 2.1718          | 32.9847 | 11.978  | 26.7093 | 26.7008   | 18.7627 |
| 2.2749        | 0.37  | 3000 | 2.1710          | 32.9835 | 11.9809 | 26.7034 | 26.6946   | 18.8016 |
| 2.2615        | 0.38  | 3100 | 2.1721          | 32.9343 | 11.9317 | 26.6752 | 26.6695   | 18.7689 |
| 2.2825        | 0.39  | 3200 | 2.1714          | 33.022  | 11.9979 | 26.7476 | 26.7402   | 18.7543 |


### Framework versions

- Transformers 4.20.1
- Pytorch 1.12.0+cu113
- Datasets 2.3.2
- Tokenizers 0.12.1