# collapse_gemma-2-27b_hs2_accumulate_iter2_sftsd0
This model is a fine-tuned version of [google/gemma-2-27b](https://huggingface.co/google/gemma-2-27b) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.9209
- Num Input Tokens Seen: 9209636
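
As a minimal usage sketch (assuming the checkpoint is published on the Hugging Face Hub under the repo id matching this card's title; verify the id before use), the model can be loaded with the standard Transformers API:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id taken from the model card title; confirm it resolves before relying on it.
repo_id = "RylanSchaeffer/collapse_gemma-2-27b_hs2_accumulate_iter2_sftsd0"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # 27B parameters; half precision keeps memory manageable
    device_map="auto",           # shard across available GPUs
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```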
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 8e-06
- train_batch_size: 4
- eval_batch_size: 16
- seed: 0
- gradient_accumulation_steps: 32
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1
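
The list above maps directly onto `transformers.TrainingArguments`. A minimal sketch, assuming the Hugging Face `Trainer` was used (the `output_dir` below is a placeholder, not taken from this card):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="collapse_gemma-2-27b_hs2_accumulate_iter2_sftsd0",  # placeholder path
    learning_rate=8e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=16,
    seed=0,
    gradient_accumulation_steps=32,   # 4 * 32 = 128 effective train batch size
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.05,
    num_train_epochs=1,
    adam_beta1=0.9,                   # Adam settings as listed above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```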
### Training results
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log        | 0      | 0    | 1.1282          | 0                 |
| 1.7181        | 0.0278 | 5    | 1.0178          | 258940            |
| 1.5727        | 0.0556 | 10   | 0.9732          | 523840            |
| 1.4846        | 0.0834 | 15   | 0.9611          | 775600            |
| 1.5742        | 0.1113 | 20   | 0.9573          | 1031780           |
| 1.5061        | 0.1391 | 25   | 0.9571          | 1291404           |
| 1.2746        | 0.1669 | 30   | 0.9544          | 1551680           |
| 1.2702        | 0.1947 | 35   | 0.9557          | 1808156           |
| 1.329         | 0.2225 | 40   | 0.9525          | 2060568           |
| 1.1092        | 0.2503 | 45   | 0.9495          | 2319496           |
| 0.9658        | 0.2782 | 50   | 0.9482          | 2567632           |
| 1.0994        | 0.3060 | 55   | 0.9444          | 2831744           |
| 1.0686        | 0.3338 | 60   | 0.9435          | 3087788           |
| 1.115         | 0.3616 | 65   | 0.9405          | 3340312           |
| 1.0044        | 0.3894 | 70   | 0.9375          | 3602000           |
| 1.1384        | 0.4172 | 75   | 0.9357          | 3868648           |
| 1.0943        | 0.4451 | 80   | 0.9361          | 4121888           |
| 1.0129        | 0.4729 | 85   | 0.9323          | 4375104           |
| 0.9281        | 0.5007 | 90   | 0.9314          | 4629144           |
| 0.9001        | 0.5285 | 95   | 0.9316          | 4881800           |
| 1.0471        | 0.5563 | 100  | 0.9303          | 5142288           |
| 1.0141        | 0.5841 | 105  | 0.9302          | 5398480           |
| 1.0427        | 0.6120 | 110  | 0.9280          | 5651544           |
| 0.9628        | 0.6398 | 115  | 0.9274          | 5904284           |
| 0.8986        | 0.6676 | 120  | 0.9257          | 6160992           |
| 0.9081        | 0.6954 | 125  | 0.9279          | 6427076           |
| 0.957         | 0.7232 | 130  | 0.9241          | 6686176           |
| 0.9556        | 0.7510 | 135  | 0.9246          | 6942364           |
| 0.9609        | 0.7789 | 140  | 0.9244          | 7193836           |
| 0.9889        | 0.8067 | 145  | 0.9228          | 7452352           |
| 0.9009        | 0.8345 | 150  | 0.9231          | 7708728           |
| 0.8942        | 0.8623 | 155  | 0.9217          | 7969644           |
| 0.9304        | 0.8901 | 160  | 0.9216          | 8223032           |
| 0.9462        | 0.9179 | 165  | 0.9212          | 8481188           |
| 0.9904        | 0.9458 | 170  | 0.9204          | 8743924           |
| 0.9147        | 0.9736 | 175  | 0.9204          | 8999112           |
### Framework versions
- Transformers 4.44.0
- Pytorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1