Update README.md
README.md
@@ -26,6 +26,18 @@ Chocolatine is the **best-performing 3B model** on the [OpenLLM Leaderboard](htt
 
 
 
+
+|      Metric       |Value|
+|-------------------|----:|
+|Avg.               |27.63|
+|IFEval (0-Shot)    |56.23|
+|BBH (3-Shot)       |37.16|
+|MATH Lvl 5 (4-Shot)|14.5|
+|GPQA (0-shot)      |9.62|
+|MuSR (0-shot)      |15.1|
+|MMLU-PRO (5-shot)  |33.21|
+
+
 ### MT-Bench-French
 
 Chocolatine-3B-Instruct-DPO-Revised is outperforming GPT-3.5-Turbo on [MT-Bench-French](https://huggingface.co/datasets/bofenghuang/mt-bench-french) by Bofeng Huang,