duo-predict-gpt2-medium-wikitext

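This model is a fine-tuned version of a base model that this card does not name, trained on an unknown dataset. At the last logged evaluation step (17500), it reports a validation loss of 2.2546, accuracy of 0.0073, perplexity of 9.5311, and BLEU of 1.0 on the evaluation set.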

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 64
eval_batch_size: 64
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 5
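
The linear schedule with a warmup ratio of 0.1 listed above can be sketched in plain Python. This is a minimal illustration, not the training code: the function name linear_schedule_lr is hypothetical, the total step count of 17815 is an estimate inferred from the Epoch and Step columns of the results table (about 3563 steps per epoch over 5 epochs), and rounding the warmup length up is an assumption.

```python
import math

BASE_LR = 1e-4        # learning_rate from the list above
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 17815   # assumption: estimated from the results table, not stated in the card


def linear_schedule_lr(step, total_steps=TOTAL_STEPS, base_lr=BASE_LR,
                       warmup_ratio=WARMUP_RATIO):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    # Assumption: the warmup length is the ratio times the total, rounded up.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 500, 1782, 10000, 17500):
        print(step, f"{linear_schedule_lr(step):.2e}")
```

Under these assumptions the rate reaches its peak of 0.0001 at step 1782 and falls to zero at the final step, so the last logged evaluations run at a very small learning rate.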

Training Loss	Epoch	Step	Validation Loss	Accuracy	Perplexity	Bleu
7.6654	0.1403	500	3.7315	0.0073	41.7396	1.0
7.0276	0.2807	1000	3.4735	0.0073	32.2490	1.0
6.4629	0.4210	1500	3.1863	0.0073	24.1987	1.0
5.9671	0.5613	2000	2.9542	0.0073	19.1873	1.0
5.6969	0.7017	2500	2.8233	0.0073	16.8331	1.0
5.5077	0.8420	3000	2.7351	0.0073	15.4112	1.0
5.3536	0.9823	3500	2.6607	0.0073	14.3059	1.0
5.2099	1.1226	4000	2.6000	0.0073	13.4641	1.0
5.1158	1.2630	4500	2.5493	0.0073	12.7980	1.0
5.0453	1.4033	5000	2.5125	0.0073	12.3362	1.0
4.955	1.5436	5500	2.4806	0.0073	11.9489	1.0
4.9157	1.6840	6000	2.4537	0.0073	11.6310	1.0
4.8756	1.8243	6500	2.4300	0.0073	11.3584	1.0
4.844	1.9646	7000	2.4100	0.0073	11.1342	1.0
4.7136	2.1050	7500	2.3948	0.0073	10.9657	1.0
4.6911	2.2453	8000	2.3805	0.0073	10.8105	1.0
4.6741	2.3856	8500	2.3668	0.0073	10.6637	1.0
4.6485	2.5260	9000	2.3538	0.0073	10.5257	1.0
4.623	2.6663	9500	2.3416	0.0073	10.3976	1.0
4.6016	2.8066	10000	2.3303	0.0073	10.2806	1.0
4.5823	2.9470	10500	2.3202	0.0073	10.1776	1.0
4.4802	3.0873	11000	2.3143	0.0073	10.1182	1.0
4.4671	3.2276	11500	2.3073	0.0073	10.0469	1.0
4.4557	3.3679	12000	2.3006	0.0073	9.9800	1.0
4.4437	3.5083	12500	2.2928	0.0073	9.9023	1.0
4.4402	3.6486	13000	2.2862	0.0073	9.8375	1.0
4.4482	3.7889	13500	2.2800	0.0073	9.7763	1.0
4.4279	3.9293	14000	2.2752	0.0073	9.7303	1.0
4.3188	4.0696	14500	2.2730	0.0073	9.7087	1.0
4.3193	4.2099	15000	2.2691	0.0073	9.6704	1.0
4.3158	4.3503	15500	2.2652	0.0073	9.6329	1.0
4.3196	4.4906	16000	2.2619	0.0073	9.6012	1.0
4.2946	4.6309	16500	2.2589	0.0073	9.5722	1.0
4.3078	4.7713	17000	2.2564	0.0073	9.5487	1.0
4.2974	4.9116	17500	2.2546	0.0073	9.5311	1.0
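
The Perplexity column appears to be the exponential of the Validation Loss column. The short check below uses three rows copied from the table to illustrate that relationship. The helper name perplexity is hypothetical, and the tolerance is an assumption that allows for the losses being rounded to four decimals.

```python
import math

# (validation loss, reported perplexity) pairs copied from the table above.
rows = [
    (3.7315, 41.7396),  # step 500
    (2.4100, 11.1342),  # step 7000
    (2.2546, 9.5311),   # step 17500
]


def perplexity(loss):
    """Perplexity as the exponential of a mean cross-entropy loss in nats."""
    return math.exp(loss)


for loss, reported in rows:
    computed = perplexity(loss)
    # Assumption: a relative tolerance of 1e-3 covers rounding in the table.
    matches = math.isclose(computed, reported, rel_tol=1e-3)
    print(f"loss {loss:.4f}: computed {computed:.4f}, reported {reported:.4f}, match={matches}")
```

Because of this relationship, the two columns carry the same information, and a drop in validation loss from 3.7315 to 2.2546 corresponds to the perplexity falling from 41.7396 to 9.5311.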