Update README.md
README.md
CHANGED
@@ -96,6 +96,42 @@ IFEval with load_in_4bit:
 | | |none | 0|prompt_level_loose_acc |↑ |0.6691|± |0.0202|
 | | |none | 0|prompt_level_strict_acc|↑ |0.6285|± |0.0208|
 
+GPQA with load_in_4bit:
+
+| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
+|-------------------------------|------:|----------------|-----:|-----------|---|-----:|---|-----:|
+|gpqa_diamond_cot_n_shot | 2|flexible-extract| 0|exact_match|↑ |0.2273|± |0.0299|
+| | |strict-match | 0|exact_match|↑ |0.0051|± |0.0051|
+|gpqa_diamond_cot_zeroshot | 1|flexible-extract| 0|exact_match|↑ |0.1364|± |0.0245|
+| | |strict-match | 0|exact_match|↑ |0.0000|± |0.0000|
+|gpqa_diamond_generative_n_shot | 2|flexible-extract| 0|exact_match|↑ |0.2980|± |0.0326|
+| | |strict-match | 0|exact_match|↑ |0.0152|± |0.0087|
+|gpqa_diamond_n_shot | 2|none | 0|acc |↑ |0.2121|± |0.0291|
+| | |none | 0|acc_norm |↑ |0.2121|± |0.0291|
+|gpqa_diamond_zeroshot | 1|none | 0|acc |↑ |0.4192|± |0.0352|
+| | |none | 0|acc_norm |↑ |0.4192|± |0.0352|
+|gpqa_extended_cot_n_shot | 2|flexible-extract| 0|exact_match|↑ |0.2179|± |0.0177|
+| | |strict-match | 0|exact_match|↑ |0.0000|± |0.0000|
+|gpqa_extended_cot_zeroshot | 1|flexible-extract| 0|exact_match|↑ |0.1538|± |0.0155|
+| | |strict-match | 0|exact_match|↑ |0.0055|± |0.0032|
+|gpqa_extended_generative_n_shot| 2|flexible-extract| 0|exact_match|↑ |0.2821|± |0.0193|
+| | |strict-match | 0|exact_match|↑ |0.0018|± |0.0018|
+|gpqa_extended_n_shot | 2|none | 0|acc |↑ |0.2473|± |0.0185|
+| | |none | 0|acc_norm |↑ |0.2473|± |0.0185|
+|gpqa_extended_zeroshot | 1|none | 0|acc |↑ |0.3681|± |0.0207|
+| | |none | 0|acc_norm |↑ |0.3681|± |0.0207|
+|gpqa_main_cot_n_shot | 2|flexible-extract| 0|exact_match|↑ |0.2232|± |0.0197|
+| | |strict-match | 0|exact_match|↑ |0.0022|± |0.0022|
+|gpqa_main_cot_zeroshot | 1|flexible-extract| 0|exact_match|↑ |0.1205|± |0.0154|
+| | |strict-match | 0|exact_match|↑ |0.0022|± |0.0022|
+|gpqa_main_generative_n_shot | 2|flexible-extract| 0|exact_match|↑ |0.2701|± |0.0210|
+| | |strict-match | 0|exact_match|↑ |0.0112|± |0.0050|
+|gpqa_main_n_shot | 2|none | 0|acc |↑ |0.2701|± |0.0210|
+| | |none | 0|acc_norm |↑ |0.2701|± |0.0210|
+|gpqa_main_zeroshot | 1|none | 0|acc |↑ |0.3795|± |0.0230|
+| | |none | 0|acc_norm |↑ |0.3795|± |0.0230|
+
+
 Sonnet sampling for style impressions:
 ```
 Upon the Dawn's Awakening