rocket-3B / README.md

Task Performance Metrics

The following table reports evaluation results for each task. The 'Metric' column names the measure, either accuracy (acc) or normalized accuracy (acc_norm); 'Value' is the score for that metric, and 'Stderr' is its standard error.

| Task | Version | Metric | Value | Stderr |
|---|---|---|---|---|
| arc_challenge | 0 | acc | 0.4334 | ± 0.0145 |
| | | acc_norm | 0.4394 | ± 0.0145 |
| arc_easy | 0 | acc | 0.6974 | ± 0.0094 |
| | | acc_norm | 0.6170 | ± 0.0100 |
| boolq | 1 | acc | 0.8171 | ± 0.0068 |
| hellaswag | 0 | acc | 0.5770 | ± 0.0049 |
| | | acc_norm | 0.7391 | ± 0.0044 |
| openbookqa | 0 | acc | 0.2800 | ± 0.0201 |
| | | acc_norm | 0.3760 | ± 0.0217 |
| piqa | 0 | acc | 0.7797 | ± 0.0097 |
| | | acc_norm | 0.7622 | ± 0.0099 |
| winogrande | 0 | acc | 0.6322 | ± 0.0136 |
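One way to read the 'Stderr' column is as the half-width basis for an approximate confidence interval. The sketch below assumes a normal approximation with z = 1.96 (roughly 95% coverage); that choice and the helper name `confidence_interval` are illustrative assumptions, not something the table itself specifies.

```python
def confidence_interval(value, stderr, z=1.96):
    """Approximate interval: value +/- z * stderr (normal approximation, assumed)."""
    margin = z * stderr
    return value - margin, value + margin


# arc_challenge acc from the table: 0.4334 with standard error 0.0145
low, high = confidence_interval(0.4334, 0.0145)
print(f"arc_challenge acc: {low:.4f} to {high:.4f}")
```

Intervals like this make it easier to judge whether a gap between two scores is larger than the measurement noise.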

Average: 0.6261 (mean over the seven tasks, taking acc_norm where it is reported and acc otherwise)
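The reported average is consistent with taking one score per task, acc_norm where it is listed and acc otherwise, and then computing the plain mean. The following is a minimal sketch that checks this reading against the values in the table; the dictionary layout and the helper name `headline_score` are illustrative assumptions.

```python
# Scores copied from the table above.
scores = {
    "arc_challenge": {"acc": 0.4334, "acc_norm": 0.4394},
    "arc_easy": {"acc": 0.6974, "acc_norm": 0.6170},
    "boolq": {"acc": 0.8171},
    "hellaswag": {"acc": 0.5770, "acc_norm": 0.7391},
    "openbookqa": {"acc": 0.2800, "acc_norm": 0.3760},
    "piqa": {"acc": 0.7797, "acc_norm": 0.7622},
    "winogrande": {"acc": 0.6322},
}


def headline_score(metrics):
    """Prefer acc_norm when the task reports it, else fall back to acc."""
    return metrics.get("acc_norm", metrics["acc"])


average = sum(headline_score(m) for m in scores.values()) / len(scores)
print(f"Average: {average:.4f}")
```

Averaging acc alone gives a different number, so the fallback rule matters when comparing this figure with averages reported elsewhere.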