Update README.md
README.md CHANGED
```diff
@@ -114,32 +114,10 @@ mera-mix-4x7B achieves the score of 75.91 on the OpenLLM Eval and compares well
 
 You can try the model with the [Mera Mixture Chat](https://huggingface.co/spaces/meraGPT/mera-mixture-chat).
 
-<!--
-## OpenLLM Eval
-
-| Model | ARC |HellaSwag|MMLU |TruthfulQA|Winogrande|GSM8K|Average|
-|-------------------------------------------------------------|----:|--------:|----:|---------:|---------:|----:|------:|
-|[mera-mix-4x7B](https://huggingface.co/meraGPT/mera-mix-4x7B)|72.01|    88.82|63.67|     77.45|     84.61|71.65|  76.37|
-
-Raw eval results are available at this [gist](https://gist.github.com/codelion/78f88333230801c9bbaa6fc22078d820)
--->
-
-# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
-Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_meraGPT__mera-mix-4x7B)
-
-| Metric                          |Value|
-|---------------------------------|----:|
-|Avg.                             |75.91|
-|AI2 Reasoning Challenge (25-Shot)|72.95|
-|HellaSwag (10-Shot)              |89.17|
-|MMLU (5-Shot)                    |64.44|
-|TruthfulQA (0-shot)              |77.17|
-|Winogrande (5-shot)              |85.64|
-|GSM8k (5-shot)                   |66.11|
-
 In addition, to the official Open LLM Leaderboard, the results on OpenLLM Eval have been validated by [others as well (76.59)](https://github.com/saucam/model_evals/tree/main?tab=readme-ov-file#model-eval-results).
 
 Our own initial eval is available [here (76.37)](https://gist.github.com/codelion/78f88333230801c9bbaa6fc22078d820).
+
 # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
 Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_meraGPT__mera-mix-4x7B)
 
```
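As a sanity check on the numbers quoted above: the six leaderboard metrics in the removed table average to (72.95 + 89.17 + 64.44 + 77.17 + 85.64 + 66.11) / 6 ≈ 75.91, matching the headline score cited in the hunk context, and the older OpenLLM Eval row likewise averages to 76.37, matching the linked gist.

For readers who want to try the model outside the hosted Mera Mixture Chat Space, here is a minimal sketch using the Hugging Face transformers library. It is not part of this commit; the repo id is taken from the links above, and the dtype, device placement, and generation settings are illustrative assumptions, not settings from the model card.

```python
# Illustrative sketch (not from this commit): load mera-mix-4x7B and generate.
# Assumes enough GPU memory for a 4x7B mixture-of-experts checkpoint;
# the dtype, device, and generation settings below are placeholder choices.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meraGPT/mera-mix-4x7B"  # repo id taken from the README links

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to cut memory use
    device_map="auto",          # let accelerate spread layers across devices
)

prompt = "Briefly explain what a mixture-of-experts language model is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```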