Update README.md
README.md CHANGED
@@ -12,7 +12,7 @@ widget:
 
 # Qwen1.5-0.5B-Chat with EPFL DPO fine-tuning
 
-Qwen1.5-0.5B-Chat DPO fine-tuned on the
+Qwen1.5-0.5B-Chat DPO fine-tuned on the Orca Math dataset that consists of ~200K grade school math word problems
 
 ## Model Details
 
@@ -29,7 +29,7 @@ answer open-ended and multiple-choice questions from Orca Math dataset
 
 ### Training Data
 
-
+HuggingFace dataset : microsoft/orca-math-word-problems-200k
 
 ### Training Procedure
 