Update README.md
README.md
@@ -10,16 +10,11 @@ pipeline_tag: text-classification
 
 ## Model Description
 
-`ReasonEval-34B` is a 34B parameter decoder-only language model fine-tuned from [`llemma_34b`](https://huggingface.co/EleutherAI/llemma_34b).
-
-<p align="center">
-<img src="introduction.jpg" alt="error" style="width:95%;">
-</p>
-
-`ReasonEval-34B` assesses the problem-solving process in a step-by-step format from the following perspectives:
+`ReasonEval-34B` is a 34B parameter decoder-only language model fine-tuned from [`llemma_34b`](https://huggingface.co/EleutherAI/llemma_34b). Given a mathematical problem and the solution, `ReasonEval-7B` assesses the problem-solving process in a step-by-step format from the following perspectives:
 - **Validity**: The step contains no mistakes in calculation and logic.
 - **Redundancy**: The step lacks utility in solving the problem but is still valid.
 
+
 With ReasonEval, you can
 
 - 📏 quantify the quality of reasoning steps free of human or close-source models.
@@ -34,12 +29,12 @@ With ReasonEval, you can
 classification head for next-token prediction is replaced with a classification head for outputting the
 possibilities of each class of reasong steps.
 * **Language(s)**: English
-* **Paper**: [Evaluating Mathematical Reasoning Beyond Accuracy](
+* **Paper**: [Evaluating Mathematical Reasoning Beyond Accuracy]()
 * **Github**: [https://github.com/GAIR-NLP/ReasonEval](https://github.com/GAIR-NLP/ReasonEval)
 * **Finetuned from model**: [https://huggingface.co/EleutherAI/llemma_34b](https://huggingface.co/EleutherAI/llemma_34b)
 * **Fine-tuning Data**: [PRM800K](https://github.com/openai/prm800k)
 
-For detailed instructions on how to use the ReasonEval-34B model, visit our GitHub repository at [https://github.com/GAIR-NLP/ReasonEval](https://github.com/GAIR-NLP/ReasonEval).
+For detailed instructions on how to use the ReasonEval-34B model, visit our GitHub repository at [https://github.com/GAIR-NLP/ReasonEval](https://github.com/GAIR-NLP/ReasonEval) and the [paper]() .
 ## How to Cite
 ```bibtex
 ```