Syed Hasan
committed on
Commit caa9443
1 Parent(s): da126e2
Update README.md
README.md
CHANGED
@@ -62,6 +62,7 @@ Average: 75.9% without mmlu
 | | |mc2 |77.90|± | 1.37|
 
 ### BigBench Reasoning Test
+```
 | Task | Version | Metric | Value | | Stderr|
 |------------------------------------------------|---------|-----------------------|--------------|---|-------|
 | bigbench_causal_judgement | 0| multiple_choice_grade | 0.6000 | _ | 0.0356 |
@@ -84,8 +85,10 @@ Average: 75.9% without mmlu
 | bigbench_tracking_shuffled_objects_five_objects| 0| multiple_choice_grade | 0.2328 | _ | 0.0120 |
 | bigbench_tracking_shuffled_objects_seven_objects| 0| multiple_choice_grade | 0.193714285714| _ | 0.0094 |
 | bigbench_tracking_shuffled_objects_three_objects| 0| multiple_choice_grade | 0.593333333333| _ | 0.0284 |
-
+```
 Average: 49.08%
+
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training: