collapse_gemma-2-2b_hs2_replace_iter3_sftsd2

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. Evaluation results are logged in the table below; the last logged validation loss is 1.8388 at step 155 (epoch 0.9956, 8195288 input tokens seen).

Model description

More information needed

Training results

Training loss, validation loss, and cumulative input tokens were logged every 5 steps over roughly one epoch:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.5833	0.0321	5	1.3056	264760
1.4802	0.0642	10	1.2129	531232
1.0682	0.0963	15	1.2152	792048
0.877	0.1285	20	1.3193	1057288
0.5869	0.1606	25	1.4425	1314432
0.5375	0.1927	30	1.5556	1577064
0.3261	0.2248	35	1.6875	1842448
0.2372	0.2569	40	1.8303	2102536
0.1862	0.2890	45	1.9071	2365920
0.1235	0.3212	50	1.9770	2636264
0.1133	0.3533	55	2.0005	2893776
0.0811	0.3854	60	1.9080	3156592
0.0467	0.4175	65	1.9028	3412792
0.053	0.4496	70	1.9141	3681376
0.1024	0.4817	75	1.8865	3943248
0.0689	0.5138	80	1.8100	4209328
0.0592	0.5460	85	1.7858	4475792
0.0753	0.5781	90	1.7337	4742648
0.0373	0.6102	95	1.7169	5010136
0.0492	0.6423	100	1.7129	5275128
0.041	0.6744	105	1.7290	5545064
0.0362	0.7065	110	1.7868	5804720
0.0454	0.7387	115	1.8283	6071728
0.0387	0.7708	120	1.8346	6344272
0.058	0.8029	125	1.7726	6612848
0.0502	0.8350	130	1.7259	6885872
0.0512	0.8671	135	1.7473	7146016
0.0594	0.8992	140	1.8113	7410008
0.042	0.9314	145	1.8112	7672408
0.0445	0.9635	150	1.7757	7936232
0.0329	0.9956	155	1.8388	8195288
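
The log above shows training loss falling steadily while validation loss bottoms out early and then climbs, so the last checkpoint is not necessarily the best one on held-out data. As a minimal sketch (not part of the original training code), the snippet below parses an excerpt of the table, steps 0 to 30 only, and picks the row with the lowest validation loss. The helper names `parse_log` and `best_checkpoint` are hypothetical, and the tab-separated layout is assumed to match the table as shown.

```python
import csv
import io

# Excerpt of the training log above (steps 0-30), tab-separated.
LOG = """Training Loss\tEpoch\tStep\tValidation Loss\tInput Tokens Seen
No log\t0\t0\t1.3956\t0
1.5833\t0.0321\t5\t1.3056\t264760
1.4802\t0.0642\t10\t1.2129\t531232
1.0682\t0.0963\t15\t1.2152\t792048
0.877\t0.1285\t20\t1.3193\t1057288
0.5869\t0.1606\t25\t1.4425\t1314432
0.5375\t0.1927\t30\t1.5556\t1577064
"""


def parse_log(text):
    """Parse the tab-separated log into a list of dicts (hypothetical helper)."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text), delimiter="\t"):
        train = rec["Training Loss"]
        rows.append({
            "step": int(rec["Step"]),
            # Step 0 has no training loss yet and is logged as "No log".
            "train_loss": None if train == "No log" else float(train),
            "val_loss": float(rec["Validation Loss"]),
            "tokens": int(rec["Input Tokens Seen"]),
        })
    return rows


def best_checkpoint(rows):
    """Return the logged row with the lowest validation loss."""
    return min(rows, key=lambda r: r["val_loss"])


rows = parse_log(LOG)
best = best_checkpoint(rows)
print(best["step"], best["val_loss"])  # 10 1.2129
```

Running the same helpers over the full table gives the same answer: every validation loss logged after step 30 is higher than the 1.2129 recorded at step 10.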