barthez-deft-sciences_de_l_information
This model is a fine-tuned version of moussaKam/barthez on an unknown dataset.
Note: this model is one of the preliminary experiments, and it underperforms the models published in the paper (which use MBartHez plus HAL/Wiki pre-training and copy mechanisms).
It achieves the following results on the evaluation set:
- Loss: 2.0258
- Rouge1: 34.5672
- Rouge2: 16.7861
- Rougel: 27.5573
- Rougelsum: 27.6099
- Gen Len: 17.8857
Model description
More information needed
Intended uses & limitations
More information needed
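In the absence of documented usage, a hedged starting point: the checkpoint should load like any BARThez sequence-to-sequence model for French summarization, and the evaluation Gen Len of ~18 tokens suggests short, title-like outputs. A minimal sketch; the repository ID below is an assumption derived from this card's title, not a confirmed Hub path:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed repository ID -- adjust to the actual Hub path of this checkpoint.
model_id = "barthez-deft-sciences_de_l_information"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Summarize a French document; generated summaries are short (~18 tokens on the eval set).
text = (
    "Les sciences de l'information étudient la collecte, le traitement "
    "et la diffusion de l'information."
)
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_length=30, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```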
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training (see the configuration sketch after this list):
- learning_rate: 3e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20.0
- mixed_precision_training: Native AMP
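For reference, a sketch of how these settings map onto `Seq2SeqTrainingArguments` in Transformers 4.10. Only the listed values come from this card; `output_dir` and the strategy/generation flags are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameters listed above; output_dir is a placeholder.
training_args = Seq2SeqTrainingArguments(
    output_dir="./barthez-deft-sciences_de_l_information",
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20.0,
    fp16=True,                    # "Native AMP" mixed precision
    evaluation_strategy="epoch",  # assumption: matches the per-epoch table below
    predict_with_generate=True,   # assumption: needed to report ROUGE / Gen Len
)
```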
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
3.3405 | 1.0 | 106 | 2.3682 | 31.3511 | 12.1973 | 25.6977 | 25.6851 | 14.9714 |
2.4219 | 2.0 | 212 | 2.1891 | 30.1154 | 13.3459 | 25.4854 | 25.5403 | 14.0429 |
2.0789 | 3.0 | 318 | 2.0994 | 32.153 | 15.3865 | 26.1859 | 26.1672 | 15.2 |
1.869 | 4.0 | 424 | 2.0258 | 34.5797 | 16.4194 | 27.6909 | 27.7201 | 16.9857 |
1.6569 | 5.0 | 530 | 2.0417 | 34.3854 | 16.5237 | 28.7036 | 28.8258 | 15.2429 |
1.5414 | 6.0 | 636 | 2.0503 | 33.1768 | 15.4851 | 27.2818 | 27.2884 | 16.0143 |
1.4461 | 7.0 | 742 | 2.0293 | 35.4273 | 16.118 | 27.3622 | 27.393 | 16.6857 |
1.3435 | 8.0 | 848 | 2.0336 | 35.3471 | 15.9695 | 27.668 | 27.6749 | 17.2 |
1.2624 | 9.0 | 954 | 2.0779 | 35.9201 | 17.2547 | 27.409 | 27.3293 | 17.1857 |
1.1807 | 10.0 | 1060 | 2.1301 | 35.7061 | 15.9138 | 27.3968 | 27.4716 | 17.1286 |
1.0972 | 11.0 | 1166 | 2.1726 | 34.3194 | 16.1313 | 27.0367 | 27.0737 | 17.1429 |
1.0224 | 12.0 | 1272 | 2.1704 | 34.9278 | 16.7958 | 27.8754 | 27.932 | 16.6571 |
1.0181 | 13.0 | 1378 | 2.2458 | 34.472 | 15.9111 | 28.2938 | 28.2946 | 16.7571 |
0.9769 | 14.0 | 1484 | 2.3405 | 35.1592 | 16.3135 | 29.0956 | 29.0858 | 16.5429 |
0.8866 | 15.0 | 1590 | 2.3303 | 34.8732 | 15.6709 | 27.5858 | 27.6169 | 16.2429 |
0.8888 | 16.0 | 1696 | 2.2976 | 35.3034 | 16.8011 | 27.7988 | 27.7569 | 17.5143 |
0.8358 | 17.0 | 1802 | 2.3349 | 35.505 | 16.8851 | 28.3651 | 28.413 | 16.8143 |
0.8026 | 18.0 | 1908 | 2.3738 | 35.2328 | 17.0358 | 28.544 | 28.6211 | 16.6143 |
0.7487 | 19.0 | 2014 | 2.4103 | 34.0793 | 15.4468 | 27.8057 | 27.8586 | 16.7286 |
0.7722 | 20.0 | 2120 | 2.3991 | 34.8116 | 15.8706 | 27.9173 | 27.983 | 16.9286 |
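The per-epoch ROUGE and Gen Len columns above are what `Seq2SeqTrainer` reports when given a ROUGE `compute_metrics` hook together with `predict_with_generate=True`. A hedged sketch of such a hook, using the `datasets` ROUGE metric available in Datasets 1.11; this is an illustration, not the authors' training code, and it assumes the `tokenizer` from the usage sketch above:

```python
import numpy as np
from datasets import load_metric

rouge = load_metric("rouge")

def compute_metrics(eval_pred):
    preds, labels = eval_pred
    # Replace label padding (-100) with the pad token before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = rouge.compute(predictions=decoded_preds, references=decoded_labels)
    # Mid F-measure, scaled to percentages as in the table above.
    result = {k: v.mid.fmeasure * 100 for k, v in result.items()}
    # Average number of generated (non-pad) tokens, reported as "Gen Len".
    result["gen_len"] = np.mean(
        [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]
    )
    return {k: round(v, 4) for k, v in result.items()}
```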
Framework versions
- Transformers 4.10.2
- Pytorch 1.7.1+cu110
- Datasets 1.11.0
- Tokenizers 0.10.3
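To check that a local environment matches these pins before attempting to reproduce results, a quick sanity check:

```python
import datasets
import tokenizers
import torch
import transformers

# Compare the printed versions against the pins listed above.
for mod in (transformers, torch, datasets, tokenizers):
    print(mod.__name__, mod.__version__)
```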