# bart-cnn-dailymail-seed42
This model is a fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base) on the CNN/DailyMail dataset (per the model name). It achieves the following results on the evaluation set:
- Loss: 1.7126
- ROUGE-1: 0.2361
- ROUGE-2: 0.1189
- ROUGE-L: 0.1981
- ROUGE-Lsum: 0.2238
## Model description

This is `facebook/bart-base` fine-tuned for abstractive summarization. Per the model name, training used the CNN/DailyMail dataset with random seed 42; the author provided no further details.
## Intended uses & limitations

More information needed. Absent author-provided guidance, the sketch below shows one way to load the checkpoint for summarization.
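A minimal loading sketch, assuming the checkpoint is published under the repository id `tomvoelker/bart-cnn-dailymail-seed42` shown in the hub listing; the sample article and generation settings are illustrative, not the author's:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint as a summarization pipeline.
# The repository id is taken from this card's hub listing.
summarizer = pipeline("summarization", model="tomvoelker/bart-cnn-dailymail-seed42")

article = (
    "The city council approved a new transit plan on Tuesday that adds "
    "three bus lines and extends light-rail service to the airport, with "
    "construction expected to begin early next year."
)

# Generation settings here are illustrative defaults, not the author's.
print(summarizer(article, max_length=64, min_length=10, do_sample=False)[0]["summary_text"])
```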
## Training and evaluation data

More information needed. The model name suggests the CNN/DailyMail summarization dataset; a loading sketch under that assumption follows.
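The `cnn_dailymail` Hub id and its `"3.0.0"` config are the standard ones, but the author's actual data split and preprocessing are unknown:

```python
from datasets import load_dataset

# Assumption: CNN/DailyMail, inferred from the model name; "3.0.0" is the
# standard config of the cnn_dailymail dataset on the Hugging Face Hub.
raw = load_dataset("cnn_dailymail", "3.0.0")

print(raw)  # DatasetDict with train / validation / test splits
example = raw["train"][0]
print(example["article"][:200])   # source document
print(example["highlights"])      # reference summary
```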
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2000
- num_epochs: 3.0
- mixed_precision_training: Native AMP
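These values map directly onto `Seq2SeqTrainingArguments`. A sketch of how they could be reproduced; `output_dir` and the surrounding training loop are placeholders, not the author's script:

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

args = Seq2SeqTrainingArguments(
    output_dir="bart-cnn-dailymail-seed42",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,   # effective train batch size: 8 * 4 = 32
    lr_scheduler_type="linear",
    warmup_steps=2000,
    num_train_epochs=3.0,
    fp16=True,                       # "Native AMP" mixed precision
    predict_with_generate=True,      # needed for ROUGE on generated summaries
    # Adam betas=(0.9, 0.999) and epsilon=1e-8 are the optimizer defaults.
)
```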
### Training results
| Training Loss | Epoch  | Step  | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
|:--------------|:-------|:------|:----------------|:--------|:--------|:--------|:-----------|
| 2.2539        | 0.1115 | 1000  | 1.8901          | 0.2314  | 0.1104  | 0.1915  | 0.2177     |
| 2.1395        | 0.2229 | 2000  | 1.8539          | 0.2324  | 0.1110  | 0.1923  | 0.2186     |
| 2.0919        | 0.3344 | 3000  | 1.8197          | 0.2318  | 0.1137  | 0.1931  | 0.2186     |
| 2.0505        | 0.4458 | 4000  | 1.7912          | 0.2361  | 0.1178  | 0.1979  | 0.2234     |
| 2.0239        | 0.5573 | 5000  | 1.7712          | 0.2347  | 0.1174  | 0.1966  | 0.2221     |
| 1.9945        | 0.6687 | 6000  | 1.7501          | 0.2375  | 0.1187  | 0.1986  | 0.2246     |
| 1.9875        | 0.7802 | 7000  | 1.7361          | 0.2356  | 0.1183  | 0.1970  | 0.2225     |
| 1.9728        | 0.8916 | 8000  | 1.7404          | 0.2381  | 0.1193  | 0.1992  | 0.2252     |
| 1.9517        | 1.0031 | 9000  | 1.7336          | 0.2379  | 0.1191  | 0.1990  | 0.2249     |
| 1.8701        | 1.1145 | 10000 | 1.7226          | 0.2362  | 0.1186  | 0.1979  | 0.2233     |
| 1.8677        | 1.2260 | 11000 | 1.7126          | 0.2361  | 0.1189  | 0.1981  | 0.2238     |
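The ROUGE columns match the keys returned by the `evaluate` library's `rouge` metric. A minimal scoring sketch (using `evaluate` here is an assumption; it is not listed under the framework versions below):

```python
import evaluate

# Score generated summaries against references; rouge1/rouge2/rougeL/rougeLsum
# correspond to the columns in the table above.
rouge = evaluate.load("rouge")
predictions = ["the cat sat on the mat"]
references = ["a cat was sitting on the mat"]
scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
print({k: round(v, 4) for k, v in scores.items()})
```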
### Framework versions

- Transformers 4.44.2
- PyTorch 2.4.0
- Datasets 2.21.0
- Tokenizers 0.19.1