|     Tasks      |Version|Filter|n-shot| Metric |   |Value |   |Stderr|
|----------------|------:|------|-----:|--------|---|-----:|---|------|
|kobest_boolq    |      1|none  |     5|acc     |↑  |0.5726|±  |0.0132|
|                |       |none  |     5|f1      |↑  |0.5725|±  |   N/A|
|kobest_copa     |      1|none  |     5|acc     |↑  |0.5200|±  |0.0158|
|                |       |none  |     5|f1      |↑  |0.5189|±  |   N/A|
|kobest_hellaswag|      1|none  |     5|acc     |↑  |0.3640|±  |0.0215|
|                |       |none  |     5|acc_norm|↑  |0.4380|±  |0.0222|
|                |       |none  |     5|f1      |↑  |0.3592|±  |   N/A|
|kobest_sentineg |      1|none  |     5|acc     |↑  |0.5642|±  |0.0249|
|                |       |none  |     5|f1      |↑  |0.5554|±  |   N/A|
|kobest_wic      |      1|none  |     5|acc     |↑  |0.5087|±  |0.0141|
|                |       |none  |     5|f1      |↑  |0.4979|±  |   N/A|

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.2995|±  |0.0126|
|     |       |strict-match    |     5|exact_match|↑  |0.2987|±  |0.0126|
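
The Value ± Stderr columns can be read as a rough confidence interval. The sketch below is illustrative only: it assumes a normal approximation with z = 1.96 for an approximate 95% interval, and the helper name `approx_ci` is hypothetical rather than part of any evaluation tool. It does not apply to rows where Stderr is N/A.

```python
def approx_ci(value, stderr, z=1.96):
    """Return a (low, high) normal-approximation interval around value.

    Assumption: the reported Stderr is the standard error of the metric,
    and z = 1.96 gives an approximate 95% interval.
    """
    margin = z * stderr
    return value - margin, value + margin


# Example using the kobest_boolq acc row from the table above.
low, high = approx_ci(0.5726, 0.0132)
print(f"kobest_boolq acc: [{low:.4f}, {high:.4f}]")
```

For binary-choice tasks such as kobest_boolq, kobest_copa, and kobest_wic, checking whether the interval contains 0.5 is a quick way to see whether a score is distinguishable from random guessing.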