---
license: apache-2.0
language:
- en
pipeline_tag: text-classification
---



# **ReasonEval-7B Model Card**

## Model Description

`ReasonEval-7B` is a 7B-parameter decoder-only language model fine-tuned from [`WizardMath-7B-V1.1`](https://huggingface.co/WizardLM/WizardMath-7B-V1.1). Given a mathematical problem and a solution, `ReasonEval-7B` assesses the solution step by step from the following perspectives:
- **Validity**: The step contains no mistakes in calculation and logic.
- **Redundancy**: The step lacks utility in solving the problem but is still valid.

With ReasonEval, you can:

- 📏 quantify the quality of reasoning steps without relying on human annotators or closed-source models.

- 🤖 find potentially invalid or redundant steps in solutions, even when the final answer is correct.

- 🛠️ select high-quality training data for downstream tasks (e.g., fine-tuning).

## Model Details

* **Model type**: `ReasonEval-7B`'s architecture is identical to [`WizardMath-7B-V1.1`](https://huggingface.co/WizardLM/WizardMath-7B-V1.1), except that the
output head for next-token prediction is replaced with a classification head that outputs the
probability of each class of reasoning step.
* **Language(s)**: English
* **Paper**: [Evaluating Mathematical Reasoning Beyond Accuracy](https://arxiv.org/pdf/2404.05692.pdf)
* **Github**: [https://github.com/GAIR-NLP/ReasonEval](https://github.com/GAIR-NLP/ReasonEval)
* **Finetuned from model**: [https://huggingface.co/WizardLM/WizardMath-7B-V1.1](https://huggingface.co/WizardLM/WizardMath-7B-V1.1)
* **Fine-tuning Data**: [PRM800K](https://github.com/openai/prm800k)
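
The following is a minimal sketch of how per-step class probabilities from a classification head could be turned into validity and redundancy scores. It is not the repository's implementation: the three class names, their order in `LABELS`, the rule that validity is the probability mass on non-negative classes, and the min/max aggregation over steps are all assumptions made for illustration. Consult the GitHub repository for the actual label mapping and scoring code.

```python
import math

# Hypothetical label order for the classification head's logits.
# This is an assumption; check the repository for the real mapping.
LABELS = ("negative", "neutral", "positive")


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def step_scores(logits):
    """Score one reasoning step from its three class logits.

    Assumed rule: a step is valid if it is not 'negative', and
    redundant if it is 'neutral' (valid but not useful).
    """
    probs = dict(zip(LABELS, softmax(logits)))
    return {
        "validity": probs["positive"] + probs["neutral"],
        "redundancy": probs["neutral"],
    }


def solution_scores(per_step_logits):
    """Aggregate step scores into solution-level scores.

    Assumed rule: a solution is only as valid as its weakest step,
    and as redundant as its most redundant step.
    """
    scores = [step_scores(logits) for logits in per_step_logits]
    return {
        "validity": min(s["validity"] for s in scores),
        "redundancy": max(s["redundancy"] for s in scores),
    }


if __name__ == "__main__":
    # Made-up logits for a two-step solution, for illustration only.
    example = [[-2.0, 0.5, 3.0], [1.5, 0.0, -1.0]]
    print(solution_scores(example))
```

Under these assumptions, a low solution-level validity score flags a solution with at least one likely-invalid step, even when the final answer is correct.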

For detailed instructions on how to use the ReasonEval-7B model, visit our GitHub repository at [https://github.com/GAIR-NLP/ReasonEval](https://github.com/GAIR-NLP/ReasonEval).

## How to Cite
```bibtex
@article{xia2024evaluating,
        title={Evaluating Mathematical Reasoning Beyond Accuracy}, 
        author={Xia, Shijie and Li, Xuefeng and Liu, Yixin and Wu, Tongshuang and Liu, Pengfei},
        journal={arXiv preprint arXiv:2404.05692},
        year={2024},
}
```