collapse_gemma-2-2b_hs2_replace_iter5_sftsd1

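This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. Its summary evaluation results are not listed here; the validation loss at each logged step is reported in the training results table at the end of this card.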

Model description

More information needed

The training hyperparameters are not recorded in this card.

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.6329	0.0318	5	1.3100	261936
1.0812	0.0635	10	1.2393	527984
0.8509	0.0953	15	1.2939	798792
0.5856	0.1271	20	1.4435	1068224
0.3945	0.1589	25	1.5981	1333664
0.2591	0.1906	30	1.7370	1600920
0.2297	0.2224	35	1.9540	1862864
0.1491	0.2542	40	2.0318	2119104
0.0693	0.2859	45	2.2388	2377720
0.0509	0.3177	50	2.3196	2637816
0.0475	0.3495	55	2.3864	2900952
0.034	0.3813	60	2.4376	3166456
0.0324	0.4130	65	2.4449	3436144
0.034	0.4448	70	2.4523	3702280
0.0326	0.4766	75	2.4438	3966328
0.0336	0.5083	80	2.4354	4221440
0.0313	0.5401	85	2.4139	4486432
0.0283	0.5719	90	2.3846	4751320
0.0301	0.6037	95	2.3932	5019592
0.0284	0.6354	100	2.4044	5280712
0.0256	0.6672	105	2.4084	5539944
0.0329	0.6990	110	2.4300	5807632
0.0266	0.7307	115	2.4236	6068760
0.0267	0.7625	120	2.4100	6331712
0.0268	0.7943	125	2.4094	6593680
0.0272	0.8261	130	2.4229	6859744
0.0296	0.8578	135	2.4294	7118040
0.027	0.8896	140	2.4374	7383424
0.0264	0.9214	145	2.4434	7650680
0.0248	0.9531	150	2.4362	7915376
0.0264	0.9849	155	2.4400	8174680
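
The table shows training loss falling steadily while validation loss bottoms out early and then climbs, a pattern usually read as overfitting. The following is a minimal sketch of how one might locate the best logged step and the gap between the two losses. It assumes a handful of rows transcribed by hand from the table above; `ROWS`, `best_checkpoint`, and `generalization_gap` are hypothetical names introduced for illustration and are not part of any training script behind this model.

```python
# A few rows copied from the table above: (step, training loss, validation loss).
# The training loss at step 0 is "No log" in the table, so it is stored as None.
ROWS = [
    (0, None, 1.3956),
    (5, 1.6329, 1.3100),
    (10, 1.0812, 1.2393),
    (15, 0.8509, 1.2939),
    (20, 0.5856, 1.4435),
    (50, 0.0509, 2.3196),
    (100, 0.0284, 2.4044),
    (155, 0.0264, 2.4400),
]


def best_checkpoint(rows):
    """Return the (step, train_loss, val_loss) row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def generalization_gap(row):
    """Validation loss minus training loss, or None if training loss was not logged."""
    _, train_loss, val_loss = row
    if train_loss is None:
        return None
    return val_loss - train_loss


if __name__ == "__main__":
    step, _, val_loss = best_checkpoint(ROWS)
    print(f"lowest validation loss {val_loss:.4f} at step {step}")
    for row in ROWS:
        gap = generalization_gap(row)
        label = "n/a" if gap is None else f"{gap:.4f}"
        print(f"step {row[0]:>3}: validation - training = {label}")
```

Among the logged steps, the lowest validation loss in the table is at step 10, so a checkpoint saved near that step (if one exists) would be the natural candidate for evaluation under this criterion.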