---
title: PoCLeaderboard
emoji: 🏆
colorFrom: green
colorTo: pink
sdk: gradio
sdk_version: 5.4.0
app_file: app.py
pinned: false
license: mit
short_description: Example Leaderboard
---
This Space provides an interactive leaderboard for comparing language model performance across various benchmarks and custom tasks.
## Features
- Automated model evaluation using lm-evaluation-harness
- Support for standard and custom benchmarks
- Interactive visualization of results
- Daily automated evaluations
- Easy submission of new models and custom tasks
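
As a rough illustration of how a leaderboard like this can order models, the sketch below ranks models by their mean score across benchmarks. The `rank_models` helper, the shape of the `results` dictionary, and the choice of an unweighted mean are all assumptions made for illustration; they are not taken from this Space's `app.py`.

```python
from statistics import mean


def rank_models(results: dict[str, dict[str, float]]) -> list[tuple[str, float]]:
    """Rank models by mean benchmark score, best first.

    `results` maps a model name to a {benchmark: score} dict (hypothetical shape).
    Models with no scores are skipped rather than ranked as zero.
    """
    rows = [
        (model, mean(scores.values()))
        for model, scores in results.items()
        if scores
    ]
    # Sort by the aggregate score in descending order.
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    demo = {
        "model-a": {"bench-1": 0.5, "bench-2": 1.0},
        "model-b": {"bench-1": 1.0, "bench-2": 1.0},
    }
    for name, score in rank_models(demo):
        print(f"{name}: {score:.2f}")
```

An unweighted mean treats every benchmark as equally important; a real leaderboard may weight or normalize scores differently.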
## Usage
1. Visit the Space to view the current leaderboard
2. Submit new models for evaluation
3. Create custom evaluation tasks
4. Track performance trends over time
## Custom Task Format
```json
{
  "examples": [
    {
      "input": "question or prompt",
      "ideal": "expected answer",
      "metrics": ["accuracy", "f1"]
    }
  ]
}
```
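
Before submitting a custom task, it can help to check the file against the format above. The following is a minimal sketch of such a check, not the validation this Space actually performs. The `validate_task` function and the `ALLOWED_METRICS` set are hypothetical, and the assumption that only `accuracy` and `f1` are accepted comes solely from the example shown.

```python
import json

# Assumption: only the metrics that appear in the example are accepted.
ALLOWED_METRICS = {"accuracy", "f1"}


def validate_task(task) -> list[str]:
    """Return a list of problems; an empty list means the task matches the format."""
    if not isinstance(task, dict):
        return ["task must be a JSON object"]
    examples = task.get("examples")
    if not isinstance(examples, list) or not examples:
        return ["'examples' must be a non-empty list"]

    problems = []
    for i, example in enumerate(examples):
        if not isinstance(example, dict):
            problems.append(f"examples[{i}] must be an object")
            continue
        # "input" and "ideal" are both expected to be strings.
        for key in ("input", "ideal"):
            if not isinstance(example.get(key), str):
                problems.append(f"examples[{i}].{key} must be a string")
        metrics = example.get("metrics")
        if not isinstance(metrics, list) or not metrics:
            problems.append(f"examples[{i}].metrics must be a non-empty list")
        else:
            unknown = [
                m for m in metrics
                if not isinstance(m, str) or m not in ALLOWED_METRICS
            ]
            if unknown:
                problems.append(f"examples[{i}].metrics has unknown entries: {unknown}")
    return problems


if __name__ == "__main__":
    raw = '{"examples": [{"input": "2+2?", "ideal": "4", "metrics": ["accuracy"]}]}'
    print(validate_task(json.loads(raw)) or "task is valid")
```

Returning a list of problems instead of raising on the first one lets a submitter fix every issue in a single pass.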