collapse_gemma-2-2b_hs2_replace_iter4_sftsd1

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. At the last logged evaluation in the table below (step 155, epoch 0.9857), it reaches a validation loss of 2.2699 after 8,431,896 input tokens seen.

Model description

More information needed

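The training hyperparameters are not listed here. Training loss, validation loss, and input tokens seen were logged every 5 steps over just under one epoch: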

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.5484	0.0318	5	1.3101	278776
1.085	0.0636	10	1.2389	553696
0.9004	0.0954	15	1.2791	827056
0.6616	0.1272	20	1.4012	1104568
0.4986	0.1590	25	1.5346	1374176
0.3579	0.1908	30	1.6346	1652112
0.174	0.2226	35	1.8294	1920080
0.1501	0.2544	40	1.9929	2190160
0.0726	0.2862	45	2.1418	2461008
0.0577	0.3180	50	2.2236	2733848
0.0399	0.3498	55	2.2644	3013016
0.04	0.3816	60	2.2704	3278704
0.0398	0.4134	65	2.2606	3562656
0.0475	0.4452	70	2.2814	3834984
0.0332	0.4769	75	2.2999	4109144
0.0352	0.5087	80	2.2843	4381312
0.0369	0.5405	85	2.2426	4652664
0.0333	0.5723	90	2.2229	4923152
0.029	0.6041	95	2.2462	5195000
0.0306	0.6359	100	2.2501	5458808
0.0307	0.6677	105	2.2394	5732184
0.0367	0.6995	110	2.2141	6004504
0.03	0.7313	115	2.2083	6272888
0.0282	0.7631	120	2.2138	6546192
0.0288	0.7949	125	2.2392	6808560
0.0292	0.8267	130	2.2351	7076096
0.028	0.8585	135	2.2202	7349312
0.0303	0.8903	140	2.2292	7627368
0.0285	0.9221	145	2.2498	7893744
0.0261	0.9539	150	2.2720	8162232
0.0317	0.9857	155	2.2699	8431896
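
The log above shows training loss falling toward 0.03 while validation loss bottoms out early and then climbs past 2.2. A minimal sketch of how one might locate the lowest-validation-loss step and compare perplexities is given below. It uses only a handful of rows copied from the table, the helper names `best_step` and `perplexity` are hypothetical, and the perplexity conversion assumes the reported loss is a mean cross-entropy in nats per token, which the card does not state.

```python
import math

# A few rows copied from the table above: (step, training loss, validation loss).
# The training loss at step 0 is "No log" in the table, so it is None here.
ROWS = [
    (0, None, 1.3956),
    (5, 1.5484, 1.3101),
    (10, 1.085, 1.2389),
    (15, 0.9004, 1.2791),
    (20, 0.6616, 1.4012),
    (50, 0.0577, 2.2236),
    (100, 0.0306, 2.2501),
    (155, 0.0317, 2.2699),
]


def best_step(rows):
    """Return the (step, train_loss, val_loss) row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def perplexity(loss):
    """Convert a loss to perplexity, ASSUMING mean cross-entropy in nats per token."""
    return math.exp(loss)


if __name__ == "__main__":
    step, _, val_loss = best_step(ROWS)
    last_step, _, last_val_loss = ROWS[-1]
    print(f"lowest validation loss: {val_loss} at step {step}")
    print(f"validation perplexity at step {step}: {perplexity(val_loss):.2f}")
    print(f"validation perplexity at step {last_step}: {perplexity(last_val_loss):.2f}")
```

On these rows the minimum falls at step 10, so a checkpoint saved near that step, if one exists, would likely generalize better than the final one.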