pansophic committed on
Commit
04bf1c7
1 Parent(s): 32b7359

Create README.md

Files changed (1)
  1. README.md +29 -0
README.md ADDED
@@ -0,0 +1,29 @@
+ ## Task Performance Metrics
+
+ The following table reports evaluation results for each task. The `Metric` column names the measure, either accuracy (`acc`) or normalized accuracy (`acc_norm`); `Value` is the score for that metric, and `Stderr` is its standard error.
+
+ | **Task** | **Version** | **Metric** | **Value** | **Stderr** |
+ |----------------|-------------|------------|-----------|------------|
+ | arc_challenge | 0 | acc | 0.4334 | ± 0.0145 |
+ | | | acc_norm | 0.4394 | ± 0.0145 |
+ |----------------|-------------|------------|-----------|------------|
+ | arc_easy | 0 | acc | 0.6974 | ± 0.0094 |
+ | | | acc_norm | 0.6170 | ± 0.0100 |
+ |----------------|-------------|------------|-----------|------------|
+ | boolq | 1 | acc | 0.8171 | ± 0.0068 |
+ |----------------|-------------|------------|-----------|------------|
+ | hellaswag | 0 | acc | 0.5770 | ± 0.0049 |
+ | | | acc_norm | 0.7391 | ± 0.0044 |
+ |----------------|-------------|------------|-----------|------------|
+ | openbookqa | 0 | acc | 0.2800 | ± 0.0201 |
+ | | | acc_norm | 0.3760 | ± 0.0217 |
+ |----------------|-------------|------------|-----------|------------|
+ | piqa | 0 | acc | 0.7797 | ± 0.0097 |
+ | | | acc_norm | 0.7622 | ± 0.0099 |
+ |----------------|-------------|------------|-----------|------------|
+ | toxigen | 0 | acc | 0.4777 | ± 0.0163 |
+ | | | acc_norm | 0.4340 | ± 0.0162 |
+ |----------------|-------------|------------|-----------|------------|
+ | winogrande | 0 | acc | 0.6322 | ± 0.0136 |
+ |----------------|-------------|------------|-----------|------------|
+ | gsm8k | 0 | acc | 0.0144 | ± 0.0033 |
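
To make the `Stderr` column concrete, here is a minimal sketch that turns a few of the reported `Value` and `Stderr` pairs into approximate 95% confidence intervals. It assumes a normal approximation (multiplier 1.96), which the README does not state. The `results` dictionary and the `confidence_interval` helper are hypothetical names introduced only for this illustration.

```python
# (task, metric) -> (value, stderr), copied from three rows of the table.
results = {
    ("arc_challenge", "acc"): (0.4334, 0.0145),
    ("boolq", "acc"): (0.8171, 0.0068),
    ("gsm8k", "acc"): (0.0144, 0.0033),
}

# Assumption: normal approximation, two-sided 95% interval.
Z_95 = 1.96


def confidence_interval(value, stderr, z=Z_95):
    """Return (low, high) for value +/- z * stderr, clipped to [0, 1]."""
    low = max(0.0, value - z * stderr)
    high = min(1.0, value + z * stderr)
    return low, high


for (task, metric), (value, stderr) in results.items():
    low, high = confidence_interval(value, stderr)
    print(f"{task:<14} {metric:<8} {value:.4f}  95% CI [{low:.4f}, {high:.4f}]")
```

Read this way, two scores whose intervals overlap heavily should not be treated as clearly different.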