datasets:
  - DanFosing/wizardlm-vicuna-guanaco-uncensored
language:
  - en
tags:
  - text generation
  - conversational

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 44.2 |
| ARC (25-shot) | 53.24 |
| HellaSwag (10-shot) | 79.13 |
| MMLU (5-shot) | 46.65 |
| TruthfulQA (0-shot) | 42.59 |
| Winogrande (5-shot) | 75.14 |
| GSM8K (5-shot) | 7.05 |
| DROP (3-shot) | 5.63 |
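
The Avg. value above is consistent with an unweighted mean of the seven benchmark scores. The sketch below checks that arithmetic; the aggregation method is an assumption inferred from the numbers, not something this card states, and the helper name `leaderboard_average` is hypothetical.

```python
# Scores copied from the results listed above.
scores = {
    "ARC (25-shot)": 53.24,
    "HellaSwag (10-shot)": 79.13,
    "MMLU (5-shot)": 46.65,
    "TruthfulQA (0-shot)": 42.59,
    "Winogrande (5-shot)": 75.14,
    "GSM8K (5-shot)": 7.05,
    "DROP (3-shot)": 5.63,
}


def leaderboard_average(values):
    """Unweighted arithmetic mean (assumed aggregation, not confirmed by the card)."""
    values = list(values)
    return sum(values) / len(values)


average = leaderboard_average(scores.values())
print(f"Avg. {average:.1f}")
```

Rounded to one decimal place, the computed mean matches the reported Avg. row, which suggests no per-benchmark weighting was applied.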