Adding Evaluation Results

#1
Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -16,4 +16,17 @@ can produce NC-17+ content (mostly from Shinen).
 
 GPT-R merge variant will be released if it adds
 value to this already "kitchen sink" level of
-merging.
+merging.
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_digitous__Javelin-GPTJ)
+
+| Metric                | Value                     |
+|-----------------------|---------------------------|
+| Avg.                  | 35.16                     |
+| ARC (25-shot)         | 42.66                     |
+| HellaSwag (10-shot)   | 70.45                     |
+| MMLU (5-shot)         | 26.2                      |
+| TruthfulQA (0-shot)   | 36.08                     |
+| Winogrande (5-shot)   | 64.17                     |
+| GSM8K (5-shot)        | 1.82                      |
+| DROP (3-shot)         | 4.77                      |
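
The `Avg.` row appears to be the unweighted mean of the seven benchmark scores listed below it. The PR does not state how the average is computed, so the snippet below is only a quick sanity check under that assumption; the dictionary and variable names are illustrative and not part of the leaderboard tooling.

```python
# Benchmark scores copied from the table added in this PR.
scores = {
    "ARC (25-shot)": 42.66,
    "HellaSwag (10-shot)": 70.45,
    "MMLU (5-shot)": 26.2,
    "TruthfulQA (0-shot)": 36.08,
    "Winogrande (5-shot)": 64.17,
    "GSM8K (5-shot)": 1.82,
    "DROP (3-shot)": 4.77,
}

# Assumption: "Avg." is a plain arithmetic mean over all seven benchmarks.
average = round(sum(scores.values()) / len(scores), 2)
print(f"Avg. = {average}")
```

Under that assumption the result matches the 35.16 reported in the table.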