Files changed (1)
  1. README.md +118 -1
README.md CHANGED
@@ -1,6 +1,109 @@
  ---
- inference: false
  license: other
+ inference: false
+ model-index:
+ - name: wizardLM-13B-1.0-fp16
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 57.25
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TheBloke/wizardLM-13B-1.0-fp16
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 80.88
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TheBloke/wizardLM-13B-1.0-fp16
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 52.9
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TheBloke/wizardLM-13B-1.0-fp16
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 50.55
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TheBloke/wizardLM-13B-1.0-fp16
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 74.11
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TheBloke/wizardLM-13B-1.0-fp16
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 13.87
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TheBloke/wizardLM-13B-1.0-fp16
+       name: Open LLM Leaderboard
  ---

  <!-- header start -->
 
@@ -268,3 +371,17 @@ Please cite the repo if you use the data or code in this repo.
  ## Disclaimer

  The resources, including code, data, and model weights, associated with this project are restricted for academic research purposes only and cannot be used for commercial purposes. The content produced by any version of WizardLM is influenced by uncontrollable variables such as randomness, and therefore, the accuracy of the output cannot be guaranteed by this project. This project does not accept any legal liability for the content of the model output, nor does it assume responsibility for any losses incurred due to the use of associated resources and output results.
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_TheBloke__wizardLM-13B-1.0-fp16)
+
+ | Metric |Value|
+ |---------------------------------|----:|
+ |Avg. |54.93|
+ |AI2 Reasoning Challenge (25-Shot)|57.25|
+ |HellaSwag (10-Shot) |80.88|
+ |MMLU (5-Shot) |52.90|
+ |TruthfulQA (0-shot) |50.55|
+ |Winogrande (5-shot) |74.11|
+ |GSM8k (5-shot) |13.87|
+
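For reference, the "Avg." figure in the table added above is the plain arithmetic mean of the six benchmark scores carried in the new `model-index` metadata. A minimal sketch, assuming `huggingface_hub` is installed and this change has been merged into `TheBloke/wizardLM-13B-1.0-fp16`, that reads the metadata back and recomputes that mean:

```python
# Minimal sketch: recompute the leaderboard average from the model-index
# metadata added in this change. Assumes `huggingface_hub` is installed and
# the change has been merged into TheBloke/wizardLM-13B-1.0-fp16.
from huggingface_hub import ModelCard

card = ModelCard.load("TheBloke/wizardLM-13B-1.0-fp16")

# One metric per benchmark: ARC (acc_norm), HellaSwag (acc_norm), MMLU (acc),
# TruthfulQA (mc2), Winogrande (acc), GSM8k (acc).
scores = [result.metric_value for result in card.data.eval_results]

print(f"Avg. over {len(scores)} benchmarks: {sum(scores) / len(scores):.2f}")
# (57.25 + 80.88 + 52.9 + 50.55 + 74.11 + 13.87) / 6 = 54.93
```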