leaderboard-pr-bot committed on
Commit f94e158
1 Parent(s): 6252eb7

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +106 -0
README.md CHANGED
@@ -110,6 +110,98 @@ model-index:
@@ -290,3 +382,17 @@ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-le
     source:
       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=psmathur/orca_mini_v3_7b
       name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: IFEval (0-Shot)
+      type: HuggingFaceH4/ifeval
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: inst_level_strict_acc and prompt_level_strict_acc
+      value: 28.21
+      name: strict accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=pankajmathur/orca_mini_v3_7b
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: BBH (3-Shot)
+      type: BBH
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc_norm
+      value: 17.84
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=pankajmathur/orca_mini_v3_7b
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MATH Lvl 5 (4-Shot)
+      type: hendrycks/competition_math
+      args:
+        num_few_shot: 4
+    metrics:
+    - type: exact_match
+      value: 0.3
+      name: exact match
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=pankajmathur/orca_mini_v3_7b
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GPQA (0-shot)
+      type: Idavidrein/gpqa
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 0.0
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=pankajmathur/orca_mini_v3_7b
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MuSR (0-shot)
+      type: TAUR-Lab/MuSR
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 22.71
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=pankajmathur/orca_mini_v3_7b
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU-PRO (5-shot)
+      type: TIGER-Lab/MMLU-Pro
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 12.04
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=pankajmathur/orca_mini_v3_7b
+      name: Open LLM Leaderboard
 ---
 
 # orca_mini_v3_7b
 
 |Winogrande (5-shot) |74.27|
 |GSM8k (5-shot) | 7.13|
 
+
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_pankajmathur__orca_mini_v3_7b)
+
+|      Metric       |Value|
+|-------------------|----:|
+|Avg.               |13.52|
+|IFEval (0-Shot)    |28.21|
+|BBH (3-Shot)       |17.84|
+|MATH Lvl 5 (4-Shot)| 0.30|
+|GPQA (0-shot)      | 0.00|
+|MuSR (0-shot)      |22.71|
+|MMLU-PRO (5-shot)  |12.04|
+
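
The `Avg.` row in the table above appears consistent with an unweighted mean of the six benchmark scores. The sketch below checks that arithmetic using the values from the table; treating `Avg.` as a plain arithmetic mean is an assumption here, not something the PR states, and `mean_score` is a hypothetical helper name.

```python
# Scores copied from the results table in this PR.
scores = {
    "IFEval (0-Shot)": 28.21,
    "BBH (3-Shot)": 17.84,
    "MATH Lvl 5 (4-Shot)": 0.30,
    "GPQA (0-shot)": 0.00,
    "MuSR (0-shot)": 22.71,
    "MMLU-PRO (5-shot)": 12.04,
}


def mean_score(values):
    """Unweighted arithmetic mean, rounded to two decimals (assumed convention)."""
    values = list(values)
    return round(sum(values) / len(values), 2)


average = mean_score(scores.values())
print(f"Avg. = {average:.2f}")  # prints "Avg. = 13.52"
```

The six scores sum to 81.10, and 81.10 / 6 ≈ 13.517, which rounds to the reported 13.52.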