Quick eval
Quick eval for: BEE-spoke-data/tFINE-900m-e16-d32-instruct_2e
hf (pretrained=BEE-spoke-data/tFINE-900m-e16-d32-instruct_2e,trust_remote_code=True,dtype=bfloat16,trust_remote_code=True), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| boolq | 2 | none | 0 | acc | ↑ | 0.6254 | ± | 0.0085 |
| openbookqa | 1 | none | 0 | acc | ↑ | 0.1520 | ± | 0.0161 |
| | | none | 0 | acc_norm | ↑ | 0.3100 | ± | 0.0207 |
| piqa | 1 | none | 0 | acc | ↑ | 0.6072 | ± | 0.0114 |
| | | none | 0 | acc_norm | ↑ | 0.5996 | ± | 0.0114 |
| social_iqa | 0 | none | 0 | acc | ↑ | 0.4212 | ± | 0.0112 |
| tinyArc | 0 | none | 25 | acc_norm | ↑ | 0.2998 | ± | N/A |
| tinyGSM8k | 0 | flexible-extract | 5 | exact_match | ↑ | 0.0605 | ± | N/A |
| | | strict-match | 5 | exact_match | ↑ | 0.0432 | ± | N/A |
| tinyHellaswag | 0 | none | 10 | acc_norm | ↑ | 0.2969 | ± | N/A |
| tinyMMLU | 0 | none | 0 | acc_norm | ↑ | 0.3120 | ± | N/A |
| winogrande | 1 | none | 0 | acc | ↑ | 0.4964 | ± | 0.0141 |
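
To help read the Value and Stderr columns together, here is a minimal sketch that turns each zero-shot `acc` row with a reported stderr into an approximate 95% interval and checks whether it excludes random guessing. The normal approximation (`Z = 1.96`), the `chance` baselines (derived from the number of answer options per task), and the helper names are assumptions for illustration; none of this is part of the eval output.

```python
# Zero-shot `acc` values and standard errors copied from the table above.
results = {
    "boolq": (0.6254, 0.0085),
    "openbookqa": (0.1520, 0.0161),
    "piqa": (0.6072, 0.0114),
    "social_iqa": (0.4212, 0.0112),
    "winogrande": (0.4964, 0.0141),
}

# Assumed random-guess accuracy: 1 / (number of answer options per question).
chance = {
    "boolq": 1 / 2,
    "openbookqa": 1 / 4,
    "piqa": 1 / 2,
    "social_iqa": 1 / 3,
    "winogrande": 1 / 2,
}

Z = 1.96  # assumed two-sided 95% normal quantile


def confidence_interval(value, stderr, z=Z):
    """Return (low, high) for value +/- z * stderr."""
    return value - z * stderr, value + z * stderr


def differs_from_chance(task):
    """True if the task's interval does not contain its chance baseline."""
    low, high = confidence_interval(*results[task])
    return not (low <= chance[task] <= high)


for task, (value, stderr) in results.items():
    low, high = confidence_interval(value, stderr)
    print(f"{task:12s} {value:.4f}  [{low:.4f}, {high:.4f}]  "
          f"chance={chance[task]:.3f}  differs={differs_from_chance(task)}")
```

Under these assumptions, the winogrande interval contains 0.5, so that score is not distinguishable from guessing. The tiny* tasks report no stderr (`N/A`) and are left out.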