xsum_1677_bart-base

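This model is a fine-tuned version of facebook/bart-base. The card does not name the training dataset; the "xsum" prefix in the model name suggests XSum, but that is not confirmed here. It achieves the following results on the evaluation set: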

Loss: 0.6469
Rouge1: 0.3879
Rouge2: 0.1787
Rougel: 0.3238
Rougelsum: 0.3238
Gen Len: 19.6644
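
To make the Rouge1 and Rouge2 figures above easier to interpret, here is a simplified sketch of an n-gram overlap F-measure. It is not the scorer that produced the numbers in this card: it assumes plain lowercased whitespace tokenization and no stemming, and it does not cover Rougel or Rougelsum, which are based on the longest common subsequence. The function names are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, prediction, n=1):
    """Simplified ROUGE-N F-measure (assumption: whitespace tokens, no stemming)."""
    ref = ngrams(reference.lower().split(), n)
    pred = ngrams(prediction.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it appears in both.
    overlap = sum((ref & pred).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
prediction = "the cat lay on the mat"
print(round(rouge_n_f1(reference, prediction, n=1), 4))  # unigram overlap
print(round(rouge_n_f1(reference, prediction, n=2), 4))  # bigram overlap
```

On this toy pair, five of six unigrams and three of five bigrams match, which is why the bigram score is lower; the same pattern shows up in the gap between Rouge1 and Rouge2 above.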

Model description

More information needed

Intended uses & limitations

More information needed
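
The card does not document usage. As a minimal sketch, assuming the checkpoint is available on the Hub under the repository ID ryusangwon/xsum_1677_bart-base and loads with the generic sequence-to-sequence classes from Transformers, inference might look like the following. The summarize helper and its defaults are illustrative: max_new_tokens=20 is chosen only because the reported Gen Len is about 19.7, and num_beams=4 is an arbitrary choice, not a setting taken from this card.

```python
MODEL_ID = "ryusangwon/xsum_1677_bart-base"  # repository ID as shown on the model page


def summarize(text, max_new_tokens=20, num_beams=4):
    """Generate a short summary (hypothetical helper; defaults are assumptions)."""
    # Imported lazily so the module can be loaded without Transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate long articles to the 1024-token input limit of bart-base.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, num_beams=num_beams
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling summarize(article_text) downloads the weights on first use and returns a single short summary string.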

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 16
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 10
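
The arithmetic behind these settings can be sketched as follows. The effective batch size is the per-device batch size times the gradient accumulation steps (8 × 16 = 128, which matches total_train_batch_size if a single device is assumed), and a linear scheduler with warmup ramps the learning rate from 0 to 5e-05 over the first 500 steps, then decays it linearly to 0. The total number of optimizer steps is not stated in the card; the value below is a rough estimate from the results table (15,500 steps at epoch 9.72) and should be treated as an assumption.

```python
def effective_batch_size(per_device, grad_accum, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * grad_accum * num_devices


# Assumption: about 1,594 optimizer steps per epoch over 10 epochs.
ASSUMED_TOTAL_STEPS = 15_940


def linear_lr(step, base_lr=5e-5, warmup_steps=500, total_steps=ASSUMED_TOTAL_STEPS):
    """Learning rate at a given step under linear warmup then linear decay."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


print(effective_batch_size(8, 16))
for step in (0, 250, 500, 8000, ASSUMED_TOTAL_STEPS):
    print(step, linear_lr(step))
```

The peak learning rate of 5e-05 is reached at step 500, which is the first evaluation row in the results table.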

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
0.8336	0.31	500	0.7274	0.3493	0.139	0.2847	0.2847	19.511
0.7963	0.63	1000	0.6994	0.3637	0.1506	0.2977	0.2976	19.6179
0.7543	0.94	1500	0.6876	0.365	0.1531	0.2999	0.2999	19.5356
0.7461	1.25	2000	0.6795	0.3709	0.1584	0.3052	0.3051	19.6224
0.7193	1.57	2500	0.6739	0.3684	0.1593	0.3048	0.3047	19.5721
0.7225	1.88	3000	0.6666	0.371	0.16	0.3063	0.3063	19.5672
0.6779	2.2	3500	0.6660	0.3745	0.1632	0.31	0.31	19.5619
0.673	2.51	4000	0.6618	0.3763	0.1653	0.3117	0.3117	19.6738
0.6848	2.82	4500	0.6578	0.3803	0.168	0.3145	0.3145	19.6308
0.6526	3.14	5000	0.6581	0.3803	0.1679	0.3141	0.3141	19.6503
0.6497	3.45	5500	0.6555	0.3776	0.1681	0.3132	0.3133	19.643
0.6483	3.76	6000	0.6520	0.3803	0.17	0.3153	0.3152	19.6666
0.6249	4.08	6500	0.6535	0.383	0.1736	0.3186	0.3185	19.6371
0.628	4.39	7000	0.6531	0.3825	0.1728	0.3181	0.318	19.6159
0.6288	4.7	7500	0.6495	0.3827	0.1727	0.3181	0.3181	19.6695
0.5921	5.02	8000	0.6509	0.3825	0.173	0.318	0.318	19.6447
0.6003	5.33	8500	0.6513	0.3833	0.1742	0.3198	0.3197	19.6866
0.5922	5.65	9000	0.6482	0.3837	0.1737	0.3195	0.3195	19.719
0.5878	5.96	9500	0.6483	0.3824	0.1737	0.3185	0.3185	19.6156
0.5646	6.27	10000	0.6503	0.3851	0.1754	0.3203	0.3204	19.6693
0.5753	6.59	10500	0.6473	0.3855	0.1761	0.3206	0.3206	19.6873
0.579	6.9	11000	0.6467	0.3861	0.1769	0.3223	0.3223	19.6635
0.5865	7.21	11500	0.6480	0.3862	0.176	0.3213	0.3212	19.7016
0.5746	7.53	12000	0.6480	0.3878	0.1785	0.3235	0.3236	19.6531
0.5678	7.84	12500	0.6460	0.3868	0.1776	0.3221	0.322	19.7039
0.5584	8.15	13000	0.6485	0.3875	0.178	0.3233	0.3233	19.6565
0.5484	8.47	13500	0.6477	0.3867	0.1777	0.3223	0.3224	19.6937
0.558	8.78	14000	0.6468	0.3873	0.1781	0.323	0.323	19.6823
0.5482	9.1	14500	0.6475	0.3878	0.1787	0.3231	0.3232	19.6896
0.5551	9.41	15000	0.6475	0.388	0.1783	0.3238	0.3237	19.666
0.5488	9.72	15500	0.6469	0.3879	0.1787	0.3238	0.3238	19.6644
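
As a small illustration of how to read this table, the sketch below copies the Step, Validation Loss, and Rouge1 columns and finds the checkpoints with the lowest validation loss and the highest Rouge1. It only reorganizes the numbers above; whether any checkpoint other than the final one was saved is not stated in the card.

```python
# (step, validation_loss, rouge1) copied from the training results table.
ROWS = [
    (500, 0.7274, 0.3493), (1000, 0.6994, 0.3637), (1500, 0.6876, 0.365),
    (2000, 0.6795, 0.3709), (2500, 0.6739, 0.3684), (3000, 0.6666, 0.371),
    (3500, 0.6660, 0.3745), (4000, 0.6618, 0.3763), (4500, 0.6578, 0.3803),
    (5000, 0.6581, 0.3803), (5500, 0.6555, 0.3776), (6000, 0.6520, 0.3803),
    (6500, 0.6535, 0.383), (7000, 0.6531, 0.3825), (7500, 0.6495, 0.3827),
    (8000, 0.6509, 0.3825), (8500, 0.6513, 0.3833), (9000, 0.6482, 0.3837),
    (9500, 0.6483, 0.3824), (10000, 0.6503, 0.3851), (10500, 0.6473, 0.3855),
    (11000, 0.6467, 0.3861), (11500, 0.6480, 0.3862), (12000, 0.6480, 0.3878),
    (12500, 0.6460, 0.3868), (13000, 0.6485, 0.3875), (13500, 0.6477, 0.3867),
    (14000, 0.6468, 0.3873), (14500, 0.6475, 0.3878), (15000, 0.6475, 0.388),
    (15500, 0.6469, 0.3879),
]

best_by_loss = min(ROWS, key=lambda row: row[1])    # lowest validation loss
best_by_rouge1 = max(ROWS, key=lambda row: row[2])  # highest Rouge1
final = ROWS[-1]                                    # last evaluated checkpoint

print("lowest validation loss:", best_by_loss)
print("highest Rouge1:", best_by_rouge1)
print("final row:", final)
```

The lowest validation loss (0.6460) occurs at step 12500 and the highest Rouge1 (0.388) at step 15000, while the headline results at the top of the card correspond to the final row at step 15500. The differences are small, which suggests the run had largely plateaued by the last few epochs.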

Framework versions

Transformers 4.37.2
Pytorch 2.2.0+cu121
Datasets 2.16.1
Tokenizers 0.15.1
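
To compare a local environment against the versions listed above, a check along these lines may help. It is a sketch that assumes the usual PyPI distribution names (torch for Pytorch) and an exact string match on the version; the version_mismatches helper is hypothetical, and a different version does not necessarily mean the model will fail to load.

```python
from importlib import metadata

# Versions listed in this card, keyed by assumed PyPI distribution name.
PINNED = {
    "transformers": "4.37.2",
    "torch": "2.2.0+cu121",
    "datasets": "2.16.1",
    "tokenizers": "0.15.1",
}


def version_mismatches(pinned=PINNED, get_version=metadata.version):
    """Return {package: (wanted, installed)} for packages that differ or are missing."""
    mismatches = {}
    for package, wanted in pinned.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


print(version_mismatches())
```

An empty dictionary means the installed versions match the card exactly.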

ryusangwon/xsum_1677_bart-base
