Bn_GEDC

This model is a fine-tuned version of facebook/mbart-large-50. The fine-tuning dataset is not specified. At the last logged evaluation (step 156000) it achieves the following results on the evaluation set:

Loss: 0.0461
Wer: 0.07
Bleu: 0.847
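
To make the WER figure concrete, here is a minimal sketch of corpus-level word error rate under its standard definition (word-level edit distance divided by the number of reference words). The card does not say which metric implementation or tokenization was used, so the whitespace splitting and the function names `edit_distance` and `corpus_wer` are illustrative assumptions. Under this definition, a WER of 0.07 corresponds to roughly 7 word edits per 100 reference words.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def corpus_wer(references, hypotheses):
    """Total edits over total reference words.

    Assumption: words are obtained by whitespace splitting; the metric
    behind the card's numbers may tokenize differently.
    """
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        total_edits += edit_distance(ref_words, hyp.split())
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references contain no words")
    return total_edits / total_words
```

For example, one substituted word in a four-word reference gives `corpus_wer(["a b c d"], ["a x c d"])` equal to 0.25.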

Model description

More information needed

Intended uses & limitations

More information needed
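
Until usage is documented, the following is a hedged sketch of how a fine-tuned mBART-50 checkpoint is typically loaded for text-to-text generation with the Transformers library. The repository ID is taken from the model page. The helper names `generation_kwargs` and `generate_text`, the beam size, and the length limit are illustrative assumptions, and whether this checkpoint needs source or target language codes (mBART-50 uses codes such as `bn_IN`) is not stated in the card, so both are left optional.

```python
MODEL_ID = "Virus-Proton/Bn_GEDC"  # repository ID as shown on the model page


def generation_kwargs(tokenizer, tgt_lang=None, max_new_tokens=128, num_beams=4):
    """Build keyword arguments for model.generate().

    Assumption: the defaults here are illustrative, not the settings used
    to produce the reported metrics.
    """
    kwargs = {"max_new_tokens": max_new_tokens, "num_beams": num_beams}
    if tgt_lang is not None:
        # mBART-50 selects the output language by forcing its language-code
        # token as the first generated token.
        kwargs["forced_bos_token_id"] = tokenizer.convert_tokens_to_ids(tgt_lang)
    return kwargs


def generate_text(texts, src_lang=None, tgt_lang=None):
    """Run the checkpoint on a list of input strings and return decoded outputs."""
    # Imported lazily so the helper above can be used without the
    # heavyweight dependency installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    if src_lang is not None:
        tokenizer.src_lang = src_lang  # optional: mBART-50 source language code
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    output_ids = model.generate(**inputs, **generation_kwargs(tokenizer, tgt_lang))
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Calling `generate_text(["..."])` downloads the checkpoint on first use, so it is best tried in an environment with enough disk space and memory for an mBART-large model.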

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine_with_restarts
num_epochs: 2
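
As a rough illustration of the settings listed above, the sketch below maps them onto Transformers `TrainingArguments` field names and reimplements the shape of the `cosine_with_restarts` schedule in plain Python. Several things are assumptions: that the batch sizes are per-device values on a single device, that there were no warmup steps, and that the number of cycles was left at 1 (neither is listed in the card). The function name `cosine_with_restarts_lr` is hypothetical.

```python
import math

# Listed hyperparameters expressed with TrainingArguments field names.
# Assumption: the listed batch sizes correspond to per-device values.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine_with_restarts",
    "num_train_epochs": 2,
}


def cosine_with_restarts_lr(step, total_steps, base_lr=2e-5,
                            warmup_steps=0, num_cycles=1):
    """Learning rate at `step` for cosine decay with hard restarts.

    Assumptions: warmup_steps=0 and num_cycles=1 are not stated in the
    card. With a single cycle the curve is a plain cosine decay to zero.
    """
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    # Within each cycle, decay from base_lr towards 0 along a half cosine.
    cycle_position = (num_cycles * progress) % 1.0
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * cycle_position)))
```

With one cycle the rate starts at 2e-05, reaches half that value at the midpoint of training, and ends at zero. With more cycles it jumps back to the base rate at the start of each cycle.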

Training results

Training Loss	Epoch	Step	Bleu	Validation Loss	Wer
0.461	0.0245	2000	0.604	0.0894	0.185
0.0683	0.0490	4000	0.683	0.0677	0.144
0.052	0.0735	6000	0.71	0.0621	0.134
0.0427	0.0980	8000	0.732	0.0572	0.121
0.0373	0.1225	10000	0.749	0.0531	0.113
0.0335	0.1470	12000	0.759	0.0514	0.108
0.0397	0.1715	14000	0.77	0.0506	0.103
0.029	0.1960	16000	0.772	0.0508	0.103
0.0277	0.2205	18000	0.779	0.0496	0.099
0.0284	0.2450	20000	0.785	0.0468	0.096
0.0249	0.2695	22000	0.785	0.0479	0.097
0.0239	0.2940	24000	0.787	0.0481	0.095
0.0229	0.3185	26000	0.791	0.0473	0.094
0.0223	0.3430	28000	0.795	0.0461	0.092
0.0216	0.3674	30000	0.798	0.0471	0.091
0.0209	0.3919	32000	0.798	0.0467	0.091
0.0203	0.4164	34000	0.802	0.0464	0.089
0.0202	0.4409	36000	0.806	0.0454	0.087
0.0194	0.4654	38000	0.806	0.0462	0.087
0.0187	0.4899	40000	0.806	0.0471	0.087
0.0184	0.5144	42000	0.809	0.0462	0.086
0.0179	0.5389	44000	0.811	0.0444	0.085
0.0176	0.5634	46000	0.812	0.0460	0.085
0.0174	0.5879	48000	0.811	0.0469	0.086
0.0171	0.6124	50000	0.813	0.0465	0.084
0.0166	0.6369	52000	0.816	0.0446	0.083
0.016	0.6614	54000	0.816	0.0461	0.083
0.0162	0.6859	56000	0.818	0.0451	0.082
0.0158	0.7104	58000	0.819	0.0449	0.082
0.0156	0.7349	60000	0.818	0.0454	0.082
0.0157	0.7594	62000	0.82	0.0455	0.082
0.015	0.7839	64000	0.822	0.0455	0.081
0.0148	0.8084	66000	0.822	0.0461	0.081
0.0146	0.8329	68000	0.823	0.0460	0.08
0.0145	0.8574	70000	0.824	0.0446	0.08
0.0144	0.8819	72000	0.824	0.0450	0.079
0.0141	0.9064	74000	0.822	0.0477	0.081
0.0139	0.9309	76000	0.826	0.0446	0.079
0.0137	0.9554	78000	0.827	0.0452	0.078
0.0136	0.9799	80000	0.827	0.0455	0.078
0.0128	1.0044	82000	0.829	0.0462	0.078
0.0104	1.0289	84000	0.829	0.0456	0.077
0.0105	1.0534	86000	0.829	0.0465	0.078
0.0103	1.0779	88000	0.831	0.0443	0.077
0.01	1.1023	90000	0.829	0.0456	0.077
0.0103	1.1268	92000	0.83	0.0466	0.077
0.0101	1.1513	94000	0.832	0.0462	0.076
0.01	1.1758	96000	0.832	0.0458	0.076
0.01	1.2003	98000	0.834	0.0460	0.075
0.0098	1.2248	100000	0.834	0.0464	0.076
0.0098	1.2493	102000	0.834	0.0455	0.075
0.0096	1.2738	104000	0.836	0.0453	0.075
0.0099	1.2983	106000	0.835	0.0469	0.075
0.0095	1.3228	108000	0.836	0.0466	0.075
0.0094	1.3473	110000	0.836	0.0461	0.075
0.0094	1.3718	112000	0.837	0.0465	0.074
0.0093	1.3963	114000	0.838	0.0469	0.074
0.0092	1.4208	116000	0.838	0.0469	0.074
0.0092	1.4453	118000	0.838	0.0476	0.074
0.0092	1.4698	120000	0.839	0.0466	0.074
0.0091	1.4943	122000	0.841	0.0462	0.072
0.0089	1.5188	124000	0.839	0.0470	0.074
0.0088	1.5433	126000	0.839	0.0473	0.073
0.0087	1.5678	128000	0.841	0.0457	0.073
0.0086	1.5923	130000	0.843	0.0453	0.072
0.0085	1.6168	132000	0.841	0.0471	0.073
0.0086	1.6413	134000	0.842	0.0471	0.072
0.0086	1.6658	136000	0.844	0.0446	0.072
0.0082	1.6903	138000	0.844	0.0458	0.071
0.008	1.7148	140000	0.845	0.0460	0.071
0.0078	1.7393	142000	0.846	0.0460	0.071
0.008	1.7638	144000	0.846	0.0456	0.07
0.0077	1.7883	146000	0.847	0.0461	0.071
0.0077	1.8127	148000	0.847	0.0460	0.07
0.0077	1.8372	150000	0.847	0.0464	0.07
0.0076	1.8617	152000	0.847	0.0463	0.07
0.0076	1.8862	154000	0.847	0.0462	0.07
0.0076	1.9107	156000	0.847	0.0461	0.07
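
The log can be inspected programmatically. The sketch below parses a handful of rows copied from the table (tab-separated, same column names) and picks the best checkpoint by a chosen metric. It also estimates steps per epoch from the Step and Epoch columns. The helper names are hypothetical, and only five rows are embedded to keep the example short. In the full table the lowest validation loss (0.0443) is logged at step 88000, while BLEU and WER keep improving until the final rows, so the choice of selection metric matters.

```python
import csv
import io

# A subset of the training log above, copied verbatim (tab-separated).
LOG_SUBSET = (
    "Training Loss\tEpoch\tStep\tBleu\tValidation Loss\tWer\n"
    "0.461\t0.0245\t2000\t0.604\t0.0894\t0.185\n"
    "0.0179\t0.5389\t44000\t0.811\t0.0444\t0.085\n"
    "0.0103\t1.0779\t88000\t0.831\t0.0443\t0.077\n"
    "0.0089\t1.5188\t124000\t0.839\t0.0470\t0.074\n"
    "0.0076\t1.9107\t156000\t0.847\t0.0461\t0.07\n"
)


def load_rows(text):
    """Parse the tab-separated log into a list of dicts with numeric values."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = []
    for raw in reader:
        row = {name: float(value) for name, value in raw.items()}
        row["Step"] = int(row["Step"])
        rows.append(row)
    return rows


def best_row(rows, metric, lower_is_better):
    """Return the row with the best value of `metric`."""
    chooser = min if lower_is_better else max
    return chooser(rows, key=lambda row: row[metric])


def estimated_steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    last = rows[-1]
    return last["Step"] / last["Epoch"]


rows = load_rows(LOG_SUBSET)
print(best_row(rows, "Validation Loss", lower_is_better=True)["Step"])
print(best_row(rows, "Bleu", lower_is_better=False)["Step"])
print(round(estimated_steps_per_epoch(rows)))
```

The last printed value comes out at a little over 81,000 steps per epoch. Multiplying by the batch size of 16 would suggest a training set on the order of 1.3 million examples, but only if training ran on a single device without gradient accumulation, which the card does not state.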

Framework versions

Transformers 4.40.2
Pytorch 2.3.0+cu121
Datasets 2.19.1
Tokenizers 0.19.1

Repository: Virus-Proton/Bn_GEDC
