---
title: PoCLeaderboard
emoji: 🏆
colorFrom: green
colorTo: pink
sdk: gradio
sdk_version: 5.4.0
app_file: app.py
pinned: false
license: mit
short_description: Example Leaderboard
---
This Space provides an interactive leaderboard for comparing language model performance across various benchmarks and custom tasks.

## Features
- Automated model evaluation using lm-evaluation-harness
- Support for standard and custom benchmarks
- Interactive visualization of results
- Daily automated evaluations
- Easy submission of new models and custom tasks

## Usage
1. Visit the Space to view the current leaderboard
2. Submit new models for evaluation
3. Create custom evaluation tasks
4. Track performance trends over time

## Custom Task Format
```json
{
  "examples": [
    {
      "input": "question or prompt",
      "ideal": "expected answer",
      "metrics": ["accuracy", "f1"]
    }
  ]
}
```
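
Before submitting a task file, it can help to check it against the format above. The following is a minimal sketch, not the Space's own validation code: the `validate_task` helper and the `ALLOWED_METRICS` set are hypothetical, and the assumption that only `accuracy` and `f1` are accepted comes solely from the example shown here.

```python
import json

# Assumption: only the metrics that appear in the example above are accepted.
ALLOWED_METRICS = {"accuracy", "f1"}


def validate_task(task):
    """Return a list of problems found in a task; an empty list means none were found."""
    if not isinstance(task, dict):
        return ["task must be a JSON object"]
    examples = task.get("examples")
    if not isinstance(examples, list) or not examples:
        return ["'examples' must be a non-empty list"]

    errors = []
    for i, ex in enumerate(examples):
        if not isinstance(ex, dict):
            errors.append(f"examples[{i}] must be an object")
            continue
        # "input" and "ideal" must both be non-empty strings.
        for key in ("input", "ideal"):
            value = ex.get(key)
            if not isinstance(value, str) or not value.strip():
                errors.append(f"examples[{i}].{key} must be a non-empty string")
        # "metrics" must be a non-empty list of known metric names.
        metrics = ex.get("metrics")
        if not isinstance(metrics, list) or not metrics:
            errors.append(f"examples[{i}].metrics must be a non-empty list")
        else:
            unknown = [m for m in metrics
                       if not isinstance(m, str) or m not in ALLOWED_METRICS]
            if unknown:
                errors.append(f"examples[{i}].metrics has unknown entries: {unknown}")
    return errors


if __name__ == "__main__":
    sample = json.loads("""
    {
      "examples": [
        {"input": "question or prompt",
         "ideal": "expected answer",
         "metrics": ["accuracy", "f1"]}
      ]
    }
    """)
    problems = validate_task(sample)
    print("valid" if not problems else "\n".join(problems))
```

Returning a list of messages rather than raising on the first problem lets a submitter fix every issue in one pass.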