yongzx/pythia-160m-sft-hh

Model Evals:

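| Task           | Version | Filter | Metric     |   Value |   | Stderr |
|----------------|---------|--------|------------|--------:|---|-------:|
| arc_challenge  | Yaml    | none   | acc        |  0.1877 | ± | 0.0114 |
|                |         | none   | acc_norm   |  0.2372 | ± | 0.0124 |
| arc_easy       | Yaml    | none   | acc        |  0.4390 | ± | 0.0102 |
|                |         | none   | acc_norm   |  0.4082 | ± | 0.0101 |
| logiqa         | Yaml    | none   | acc        |  0.1889 | ± | 0.0154 |
|                |         | none   | acc_norm   |  0.2473 | ± | 0.0169 |
| piqa           | Yaml    | none   | acc        |  0.6213 | ± | 0.0113 |
|                |         | none   | acc_norm   |  0.6279 | ± | 0.0113 |
| sciq           | Yaml    | none   | acc        |  0.7230 | ± | 0.0142 |
|                |         | none   | acc_norm   |  0.6840 | ± | 0.0147 |
| winogrande     | Yaml    | none   | acc        |  0.5162 | ± | 0.0140 |
| wsc            | Yaml    | none   | acc        |  0.3654 | ± | 0.0474 |
| lambada_openai | Yaml    | none   | perplexity | 58.9478 | ± | 2.7662 |
|                |         | none   | acc        |  0.2602 | ± | 0.0061 |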
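
To help read the `Value ± Stderr` columns, the sketch below turns a few of the reported accuracies into approximate 95% intervals. It assumes a normal approximation (`z = 1.96`), which is not stated in the source. The helper name `confidence_interval` and the 0.5 chance level used for winogrande (treated here as a two-choice task) are also illustrative assumptions, not part of the evaluation output.

```python
# Values copied from the results table: metric -> (value, stderr).
RESULTS = {
    "arc_challenge/acc": (0.1877, 0.0114),
    "arc_easy/acc": (0.4390, 0.0102),
    "piqa/acc": (0.6213, 0.0113),
    "sciq/acc": (0.7230, 0.0142),
    "winogrande/acc": (0.5162, 0.0140),
}


def confidence_interval(value, stderr, z=1.96):
    """Return (low, high) for value +/- z * stderr.

    Assumption: the estimate is roughly normally distributed, so
    z = 1.96 corresponds to an approximate 95% interval.
    """
    margin = z * stderr
    return value - margin, value + margin


if __name__ == "__main__":
    for name, (value, stderr) in RESULTS.items():
        low, high = confidence_interval(value, stderr)
        print(f"{name}: {value:.4f} (approx. 95% CI {low:.4f} to {high:.4f})")

    # Assumed chance level for a two-choice task; hypothetical illustration only.
    low, high = confidence_interval(*RESULTS["winogrande/acc"])
    print("winogrande interval contains 0.5:", low <= 0.5 <= high)
```

Under these assumptions, an interval that contains the chance level suggests the score is not clearly distinguishable from random guessing on that task.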