| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| arc_challenge | 1 | none | 0 | acc | ↑ | 0.2304 | ± | 0.0123 |
| | | none | 0 | acc_norm | ↑ | 0.2551 | ± | 0.0127 |
| arc_easy | 1 | none | 0 | acc | ↑ | 0.2559 | ± | 0.0090 |
| | | none | 0 | acc_norm | ↑ | 0.2572 | ± | 0.0090 |
| boolq | 2 | none | 0 | acc | ↑ | 0.4599 | ± | 0.0087 |
| hellaswag | 1 | none | 0 | acc | ↑ | 0.2538 | ± | 0.0043 |
| | | none | 0 | acc_norm | ↑ | 0.2601 | ± | 0.0044 |
| openbookqa | 1 | none | 0 | acc | ↑ | 0.1580 | ± | 0.0163 |
| | | none | 0 | acc_norm | ↑ | 0.2720 | ± | 0.0199 |
| piqa | 1 | none | 0 | acc | ↑ | 0.5424 | ± | 0.0116 |
| | | none | 0 | acc_norm | ↑ | 0.5180 | ± | 0.0117 |
| winogrande | 1 | none | 0 | acc | ↑ | 0.4980 | ± | 0.0141 |
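
One way to read the Value and Stderr columns together is to build an approximate 95% confidence interval for each score and check whether it contains the random-guess baseline. The sketch below is illustrative only: the `acc` numbers are copied from the table above, while the chance baselines (0.25 for the four-option tasks, 0.5 for the two-option tasks) and the normal-approximation interval are assumptions, not something stated in this model card. The helper names are hypothetical.

```python
# (value, stderr) for the plain `acc` metric, copied from the table above.
ACC = {
    "arc_challenge": (0.2304, 0.0123),
    "arc_easy": (0.2559, 0.0090),
    "boolq": (0.4599, 0.0087),
    "hellaswag": (0.2538, 0.0043),
    "openbookqa": (0.1580, 0.0163),
    "piqa": (0.5424, 0.0116),
    "winogrande": (0.4980, 0.0141),
}

# ASSUMPTION: random-guess accuracy per task (4 options -> 0.25, 2 options -> 0.5).
# ARC items occasionally have 3 or 5 options, so 0.25 is only approximate there.
CHANCE = {
    "arc_challenge": 0.25,
    "arc_easy": 0.25,
    "boolq": 0.5,
    "hellaswag": 0.25,
    "openbookqa": 0.25,
    "piqa": 0.5,
    "winogrande": 0.5,
}


def confidence_interval(value, stderr, z=1.96):
    """Normal-approximation interval: value +/- z * stderr (z=1.96 is roughly 95%)."""
    return value - z * stderr, value + z * stderr


def differs_from_chance(task):
    """True if the assumed chance baseline lies outside the task's interval."""
    low, high = confidence_interval(*ACC[task])
    return not (low <= CHANCE[task] <= high)


if __name__ == "__main__":
    for task in ACC:
        low, high = confidence_interval(*ACC[task])
        verdict = "differs from" if differs_from_chance(task) else "consistent with"
        print(f"{task:14s} [{low:.4f}, {high:.4f}] {verdict} chance ({CHANCE[task]})")
```

Note that a score can differ from chance in either direction; the check above does not distinguish above-chance from below-chance results.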