
Quick eval

Quick eval for: BEE-spoke-data/tFINE-900m-e16-d32-instruct_2e

hf (pretrained=BEE-spoke-data/tFINE-900m-e16-d32-instruct_2e,trust_remote_code=True,dtype=bfloat16,trust_remote_code=True), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8
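
The header line above records the harness settings. As a rough sketch of how a comparable run might be started from Python, the snippet below rebuilds the `model_args` string and passes it to `lm_eval.simple_evaluate`. The helper names (`model_args_string`, `run_eval`) are hypothetical, and it is an assumption that the original run used equivalent settings through the command line; the task list is taken from the results table, and few-shot counts are left at each task's default, matching `num_fewshot: None`.

```python
# Hypothetical reproduction sketch; not the author's original invocation.
MODEL_ARGS = {
    "pretrained": "BEE-spoke-data/tFINE-900m-e16-d32-instruct_2e",
    "trust_remote_code": True,  # reported twice in the header; passed once here
    "dtype": "bfloat16",
}

# Task names as they appear in the results table.
TASKS = [
    "boolq", "openbookqa", "piqa", "social_iqa",
    "tinyArc", "tinyGSM8k", "tinyHellaswag", "tinyMMLU",
    "winogrande",
]


def model_args_string(args):
    """Join key/value pairs into the comma-separated form shown in the header."""
    return ",".join(f"{key}={value}" for key, value in args.items())


def run_eval():
    """Run the evaluation; downloads the model and needs lm-eval installed."""
    import lm_eval  # imported lazily so the helpers above work without it

    return lm_eval.simple_evaluate(
        model="hf",
        model_args=model_args_string(MODEL_ARGS),
        tasks=TASKS,
        batch_size=8,
    )
```

Calling `run_eval()` should return a dictionary whose `"results"` entry holds the per-task metrics; exact values may differ from the table depending on harness version and hardware.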

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---:|---|---:|---|---:|---|
| boolq | 2 | none | 0 | acc | 0.6254 | ± 0.0085 |
| openbookqa | 1 | none | 0 | acc | 0.1520 | ± 0.0161 |
| | | none | 0 | acc_norm | 0.3100 | ± 0.0207 |
| piqa | 1 | none | 0 | acc | 0.6072 | ± 0.0114 |
| | | none | 0 | acc_norm | 0.5996 | ± 0.0114 |
| social_iqa | 0 | none | 0 | acc | 0.4212 | ± 0.0112 |
| tinyArc | 0 | none | 25 | acc_norm | 0.2998 | ± N/A |
| tinyGSM8k | 0 | flexible-extract | 5 | exact_match | 0.0605 | ± N/A |
| | | strict-match | 5 | exact_match | 0.0432 | ± N/A |
| tinyHellaswag | 0 | none | 10 | acc_norm | 0.2969 | ± N/A |
| tinyMMLU | 0 | none | 0 | acc_norm | 0.3120 | ± N/A |
| winogrande | 1 | none | 0 | acc | 0.4964 | ± 0.0141 |
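
One way to read the Value and Stderr columns is to turn them into an approximate 95% interval (value ± 1.96 × stderr) and check whether the interval contains the score expected from uniform random guessing. The sketch below does this for the three two-option tasks in the table, where uniform guessing gives 0.5. The normal approximation and the helper names are assumptions made for illustration; they are not part of the harness output.

```python
Z_95 = 1.96  # normal-approximation multiplier for a 95% interval

# (value, stderr) pairs copied from the acc rows of the results table.
RESULTS = {
    "boolq": (0.6254, 0.0085),
    "piqa": (0.6072, 0.0114),
    "winogrande": (0.4964, 0.0141),
}

# Each of these tasks offers two answer options, so uniform guessing scores 0.5.
CHANCE = 0.5


def confidence_interval(value, stderr, z=Z_95):
    """Return the (lower, upper) bounds of value +/- z * stderr."""
    half_width = z * stderr
    return value - half_width, value + half_width


def includes_chance(value, stderr, chance=CHANCE):
    """True if the approximate 95% interval contains the chance-level score."""
    lower, upper = confidence_interval(value, stderr)
    return lower <= chance <= upper


for task, (value, stderr) in RESULTS.items():
    lower, upper = confidence_interval(value, stderr)
    flag = "includes" if includes_chance(value, stderr) else "excludes"
    print(f"{task}: [{lower:.4f}, {upper:.4f}] {flag} {CHANCE}")
```

Under these assumptions the winogrande interval contains 0.5, while the boolq and piqa intervals lie above it. The tiny* rows report no stderr, so the same check cannot be applied to them.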