task,metric,value,err,version
anli_r1,acc,0.32,0.014758652303574886,0
anli_r2,acc,0.324,0.014806864733738854,0
anli_r3,acc,0.3491666666666667,0.01376707539507725,0
arc_challenge,acc,0.2901023890784983,0.013261573677520764,0
arc_challenge,acc_norm,0.30119453924914674,0.013406741767847638,0
arc_easy,acc,0.6342592592592593,0.009882988069418829,0
arc_easy,acc_norm,0.5837542087542088,0.01011481940450087,0
boolq,acc,0.5409785932721712,0.008715635308774412,1
cb,acc,0.5535714285714286,0.06703189227942397,1
cb,f1,0.3890671420083185,,1
copa,acc,0.75,0.04351941398892446,0
hellaswag,acc,0.4640509858593906,0.0049768677965835555,0
hellaswag,acc_norm,0.6082453694483171,0.004871447106554927,0
piqa,acc,0.7551686615886833,0.010032309105568793,0
piqa,acc_norm,0.766050054406964,0.009877236895137436,0
rte,acc,0.5451263537906137,0.029973636495415252,0
sciq,acc,0.896,0.009658016218524301,0
sciq,acc_norm,0.88,0.010281328012747386,0
storycloze_2016,acc,0.711918760021379,0.010472537019822582,0
winogrande,acc,0.574585635359116,0.013895257666646378,0