wandb: https://wandb.ai/eleutherai/pythia-rlhf/runs/6y83ekqy?workspace=user-yongzx

Model Evals

| Task           | Version | Filter | Metric     |  Value |   | Stderr |
|----------------|---------|--------|------------|-------:|---|-------:|
| arc_challenge  | Yaml    | none   | acc        | 0.2526 | ± | 0.0127 |
|                |         | none   | acc_norm   | 0.2773 | ± | 0.0131 |
| arc_easy       | Yaml    | none   | acc        | 0.5791 | ± | 0.0101 |
|                |         | none   | acc_norm   | 0.4912 | ± | 0.0103 |
| lambada_openai | Yaml    | none   | perplexity | 7.0516 | ± | 0.1979 |
|                |         | none   | acc        | 0.5684 | ± | 0.0069 |
| logiqa         | Yaml    | none   | acc        | 0.2166 | ± | 0.0162 |
|                |         | none   | acc_norm   | 0.2919 | ± | 0.0178 |
| piqa           | Yaml    | none   | acc        | 0.7176 | ± | 0.0105 |
|                |         | none   | acc_norm   | 0.6964 | ± | 0.0107 |
| sciq           | Yaml    | none   | acc        | 0.8460 | ± | 0.0114 |
|                |         | none   | acc_norm   | 0.7700 | ± | 0.0133 |
| winogrande     | Yaml    | none   | acc        | 0.5399 | ± | 0.0140 |
| wsc            | Yaml    | none   | acc        | 0.3654 | ± | 0.0474 |
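
As a reading aid, the Stderr column can be turned into approximate 95% confidence intervals. The sketch below assumes a normal approximation (z = 1.96), which is not stated in the original report; the helper name `confidence_interval` is hypothetical, and the three rows are copied from the table for illustration.

```python
from typing import Tuple


def confidence_interval(value: float, stderr: float, z: float = 1.96) -> Tuple[float, float]:
    """Return (lower, upper) bounds of value ± z * stderr.

    Assumes the estimate is roughly normally distributed; z = 1.96
    corresponds to an approximate 95% interval under that assumption.
    """
    margin = z * stderr
    return value - margin, value + margin


# (value, stderr) pairs copied from the results table.
results = {
    "arc_challenge/acc": (0.2526, 0.0127),
    "piqa/acc": (0.7176, 0.0105),
    "wsc/acc": (0.3654, 0.0474),
}

for name, (value, stderr) in results.items():
    low, high = confidence_interval(value, stderr)
    print(f"{name}: {value:.4f} (approx. 95% CI {low:.4f} to {high:.4f})")
```

Rows with a large Stderr, such as wsc, yield wide intervals, so small differences on those tasks should be read with caution.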