Open-Arabic-LLM-Leaderboard-v1


Ali-C137 committed on Apr 25, 2024

Commit

c5df7b5

1 Parent(s): 08824fe

Update src/about.py

src/about.py +11 -2

@@ -42,8 +42,13 @@ TITLE = """<h1 align="center" id="space-title">Open Arabic LLM Leaderboard</h1>"
 INTRODUCTION_TEXT = """
 🚀 The Open Arabic LLM Leaderboard : Objectively evaluates and compare the performance of Arabic Large Language Models (LLMs).
-When you submit a model on the "Submit here!" page, it is automatically evaluated on a set of benchmarks. The GPU used for evaluation is operated with the support of  __[Technology Innovation Institute (TII)](https://www.tii.ae/)__.
 The datasets used for evaluation consists of datasets that are Arabic Native like the `AlGhafa` benchmark from [TII](https://www.tii.ae/) and `ACVA` benchmark from [FreedomIntelligence](https://huggingface.co/FreedomIntelligence) to assess reasoning, language understanding, commonsense, and more.
 More details about the benchmarks and the evaluation process is provided on the “About” page.
 """
@@ -55,10 +60,14 @@ While outstanding LLM models are being released competitively, most of them are
 ## Icons & Model types
 🟢 : `pretrained` or `continuously pretrained`
 🔶 : `fine-tuned on domain-specific datasets`
 💬 : `chat models (RLHF, DPO, ORPO, ...)`
 🤝 : `base merges and moerges`
 If the icon is "?", it indicates that there is insufficient information about the model.
 Please provide information about the model through an issue! 🤩
@@ -177,7 +186,7 @@ CITATION_BUTTON_TEXT = r"""
       archivePrefix={arXiv},
       primaryClass={cs.CL}
 }
-@misc{datatrove,
   author = {Clémentine, Fourrier, and Nathan, Habib and Wolf, Thomas},
   title = {LightEval: A lightweight framework for LLM evaluation},
   year = {2024},

 INTRODUCTION_TEXT = """
 🚀 The Open Arabic LLM Leaderboard : Objectively evaluates and compare the performance of Arabic Large Language Models (LLMs).
+When you submit a model on the "Submit here!" page, it is automatically evaluated on a set of benchmarks.
+The GPU used for evaluation is operated with the support of  __[Technology Innovation Institute (TII)](https://www.tii.ae/)__.
 The datasets used for evaluation consists of datasets that are Arabic Native like the `AlGhafa` benchmark from [TII](https://www.tii.ae/) and `ACVA` benchmark from [FreedomIntelligence](https://huggingface.co/FreedomIntelligence) to assess reasoning, language understanding, commonsense, and more.
 More details about the benchmarks and the evaluation process is provided on the “About” page.
 """
 ## Icons & Model types
 🟢 : `pretrained` or `continuously pretrained`
 🔶 : `fine-tuned on domain-specific datasets`
 💬 : `chat models (RLHF, DPO, ORPO, ...)`
 🤝 : `base merges and moerges`
 If the icon is "?", it indicates that there is insufficient information about the model.
 Please provide information about the model through an issue! 🤩
       archivePrefix={arXiv},
       primaryClass={cs.CL}
 }
+@misc{lighteval,
   author = {Clémentine, Fourrier, and Nathan, Habib and Wolf, Thomas},
   title = {LightEval: A lightweight framework for LLM evaluation},
   year = {2024},