metadata
base_model:
- EleutherAI/pythia-160m
Model Description
This is EleutherAI's pythia-160m model, re-uploaded as an exercise.
Evaluation Results
As required by the project, we used EleutherAI's lm-evaluation-harness to evaluate pythia-160m zero-shot on the HellaSwag benchmark.
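For readers who want to reproduce the run, here is a minimal sketch using the harness's Python entry point `lm_eval.simple_evaluate`. It assumes the `lm-eval` package (v0.4 or later) is installed and that the base checkpoint `EleutherAI/pythia-160m` is evaluated; the helper names `build_eval_config` and `run_hellaswag` are hypothetical, and the exact settings used for this card (batch size, device, harness version) are not recorded here.

```python
def build_eval_config(model_id: str = "EleutherAI/pythia-160m") -> dict:
    """Build keyword arguments for lm_eval.simple_evaluate (zero-shot HellaSwag).

    The model id defaults to the base checkpoint; swap in another
    Hugging Face repo id to evaluate a different upload (assumption).
    """
    return {
        "model": "hf",                           # Hugging Face transformers backend
        "model_args": f"pretrained={model_id}",  # checkpoint to load
        "tasks": ["hellaswag"],                  # task name as it appears in the table
        "num_fewshot": 0,                        # zero-shot, matching the n-shot column
    }


def run_hellaswag(model_id: str = "EleutherAI/pythia-160m") -> dict:
    """Run the evaluation and return the metric dictionary for hellaswag."""
    # Imported lazily so this module loads without the harness installed.
    import lm_eval  # requires: pip install lm-eval

    output = lm_eval.simple_evaluate(**build_eval_config(model_id))
    return output["results"]["hellaswag"]


# Usage (downloads the model and dataset, so it is left commented out):
# metrics = run_hellaswag()
# for name, value in metrics.items():
#     print(name, value)
```

The configuration is kept in a separate function so the settings can be inspected or logged without triggering a model download.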
HellaSwag
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| hellaswag | 1 | none | 0 | acc | ↑ | 0.2872 | ± | 0.0045 |
| | | none | 0 | acc_norm | ↑ | 0.3082 | ± | 0.0046 |
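
To help interpret the Value and Stderr columns, the sketch below computes approximate 95% confidence intervals and compares them with the 25% chance level of a four-way multiple-choice task. It also checks the reported standard errors against the simple binomial formula sqrt(p(1-p)/n). The example count of 10,042 is an assumption that the full HellaSwag validation split was used; the function names are illustrative and not part of the harness.

```python
import math

N_EXAMPLES = 10_042  # assumption: full HellaSwag validation split
CHANCE = 0.25        # four candidate endings per example

# (value, stderr) pairs copied from the results table.
REPORTED = {
    "acc": (0.2872, 0.0045),
    "acc_norm": (0.3082, 0.0046),
}


def binomial_stderr(p: float, n: int) -> float:
    """Standard error of a proportion p estimated from n independent trials."""
    return math.sqrt(p * (1.0 - p) / n)


def confidence_interval(value: float, stderr: float, z: float = 1.96):
    """Normal-approximation interval; z = 1.96 gives roughly 95% coverage."""
    return value - z * stderr, value + z * stderr


for metric, (value, stderr) in REPORTED.items():
    low, high = confidence_interval(value, stderr)
    expected = binomial_stderr(value, N_EXAMPLES)
    print(
        f"{metric}: {value:.4f} in [{low:.4f}, {high:.4f}], "
        f"binomial stderr {expected:.4f}, above chance: {low > CHANCE}"
    )
```

Under these assumptions, both intervals lie above the chance level and the reported standard errors agree with the binomial estimate to within rounding.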