| title: BenchBench Leaderboard | |
| emoji: 🏋️♂️ | |
| colorFrom: gray | |
| colorTo: blue | |
| sdk: streamlit | |
| sdk_version: 1.36.0 | |
| app_file: app.py | |
| pinned: true | |
| license: apache-2.0 | |
| Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference | |
| ``` | |
| @misc{perlitz2024benchmarkagreementtestingright, | |
| title={Benchmark Agreement Testing Done Right: A Guide for LLM Benchmark Evaluation}, | |
| author={Yotam Perlitz and Ariel Gera and Ofir Arviv and Asaf Yehudai and Elron Bandel and Eyal Shnarch and Michal Shmueli-Scheuer and Leshem Choshen}, | |
| year={2024}, | |
| eprint={2407.13696}, | |
| archivePrefix={arXiv}, | |
| primaryClass={cs.CL}, | |
| url={https://arxiv.org/abs/2407.13696}, | |
| } | |
| ``` |