Update README.md
README.md
CHANGED
@@ -18,7 +18,6 @@ This reward model is finetuned from [gemma-2b-it](https://huggingface.co/google/
 ## Evaluation
 We evaluate GRM 2B on the [reward model benchmark](https://huggingface.co/spaces/allenai/reward-bench), which achieves the **SOTA 2B Bradley–Terry model** Performance.
 
-**Note: Please download the `model.py` file from this repository to ensure the structure is loaded correctly and verify that the `v_head` is properly initialized.**
 
 | Model | Average | Chat | Chat Hard | Safety | Reasoning |
 |:-------------------------:|:-------------:|:---------:|:---------:|:--------:|:-----------:|
@@ -32,6 +31,7 @@ We evaluate GRM 2B on the [reward model benchmark](https://huggingface.co/spaces
 
 
 ## Usage
+**Note: Please download the `model.py` file from this repository to ensure the structure is loaded correctly and verify that the `v_head` is properly initialized.**
 ```
 import torch
 from transformers import AutoTokenizer, AutoModelForSequenceClassification
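The relocated note asks readers to verify that the `v_head` is properly initialized, but the diff does not show how. One possible check is sketched below. It assumes the head's parameters carry `v_head` as a module-name segment (for example `v_head.0.weight`); the actual names defined in `model.py` are not visible in this diff, so the key names and the `missing_head_keys` helper are hypothetical.

```python
from typing import Iterable, List


def missing_head_keys(
    model_keys: Iterable[str],
    checkpoint_keys: Iterable[str],
    head_prefix: str = "v_head",
) -> List[str]:
    """Return head parameter names the model expects but the checkpoint lacks.

    Any name returned here would be left at its random initialization
    after loading, which is the failure the note warns about.
    """
    available = set(checkpoint_keys)
    return sorted(
        key
        for key in model_keys
        # Match the prefix as a whole module segment: "v_head.weight"
        # or "model.v_head.weight", but not "xv_head.weight".
        if (key.startswith(head_prefix + ".") or ("." + head_prefix + ".") in key)
        and key not in available
    )


# Hypothetical key names, for illustration only.
expected = ["model.embed_tokens.weight", "v_head.0.weight", "v_head.0.bias"]
complete = ["model.embed_tokens.weight", "v_head.0.weight", "v_head.0.bias"]
partial = ["model.embed_tokens.weight"]

print(missing_head_keys(expected, complete))  # prints []
print(missing_head_keys(expected, partial))   # prints ['v_head.0.bias', 'v_head.0.weight']
```

With a real PyTorch model, the two key lists would typically come from `model.state_dict().keys()` and from the keys of the loaded checkpoint's state dict. An empty result suggests the head weights were read from the checkpoint rather than freshly initialized.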