---
title: AI Evaluation Dashboard
emoji: πŸ“Š
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
app_port: 3000
---
# AI Evaluation Dashboard
This repository contains a Next.js application for viewing and authoring AI evaluations. It provides a platform for documenting and sharing evaluations of AI systems across multiple dimensions, including capabilities and risks.
## Project Goals
The AI Evaluation Dashboard aims to:
- **Standardize AI evaluation reporting** across different AI systems and models
- **Facilitate transparency** by providing detailed evaluation cards for AI systems
- **Enable comparative analysis** of AI capabilities and risks
- **Support research and policy** by consolidating evaluation data in an accessible format
- **Promote responsible AI development** through comprehensive risk assessment
## For External Collaborators
### Making Changes to Evaluation Categories and Schema
All evaluation categories, form fields, and data structures are centrally managed in the `schema/` folder. **This is the primary location for making structural changes to the evaluation framework.**
Key schema files:
- **`schema/evaluation-schema.json`** - Defines all evaluation categories (capabilities and risks)
- **`schema/output-schema.json`** - Defines the complete data structure for evaluation outputs
- **`schema/system-info-schema.json`** - Defines form field options for system information
- **`schema/category-details.json`** - Contains detailed descriptions and criteria for each category
- **`schema/form-hints.json`** - Provides help text and guidance for form fields
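For a quick look at what a schema file defines before editing it, you can inspect it with `jq` (assuming `jq` is installed locally; it is not a project dependency):

```bash
# Pretty-print a schema to review the categories it defines.
jq '.' schema/evaluation-schema.json

# Or list only its top-level keys for a quick overview.
jq 'keys' schema/evaluation-schema.json
```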
### Standards and Frameworks Used
The evaluation framework is based on established standards:
- **Risk categories** are derived from **NIST AI 600-1** (the Generative AI Profile of the NIST AI Risk Management Framework)
- **Capability categories** are based on the **OECD Framework for the Classification of AI Systems**
This ensures consistency with international AI governance standards and facilitates interoperability with other evaluation systems.
### Contributing Evaluation Data
Evaluation data files are stored in `public/evaluations/` as JSON files. Each file represents a complete evaluation of an AI system and must conform to the schema defined in `schema/output-schema.json`.
To add a new evaluation:
1. Create a new JSON file in `public/evaluations/`
2. Follow the structure defined in `schema/output-schema.json`
3. Ensure all required fields are populated
4. Validate against the schema before submission; one possible approach is sketched below
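One way to do the validation step, sketched here with the `ajv-cli` package (an assumption — the repository may ship its own validation script, and `my-eval.json` is a placeholder filename):

```bash
# One-off schema validation via npx; ajv-cli is fetched on demand.
npx ajv-cli validate -s schema/output-schema.json -d public/evaluations/my-eval.json
```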
## Development Setup
### Run locally
Install dependencies and run the dev server:
```bash
npm ci
npm run dev
```
Build for production and run:
```bash
npm ci
npm run build
NODE_ENV=production PORT=3000 npm run start
```
## Docker (recommended for Hugging Face Spaces)
A `Dockerfile` is included for deploying this app as a dynamic service on Hugging Face Spaces (Docker runtime).
Build the image locally:
```bash
docker build -t ai-eval-dashboard .
```
Run the container (expose port 3000):
```bash
docker run -p 3000:3000 -e HF_TOKEN="$HF_TOKEN" ai-eval-dashboard
```
Visit `http://localhost:3000` to verify.
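The same check can be scripted; this assumes the root page is served without authentication:

```bash
# Expect an HTTP 200 status line if the app is up.
curl -sI http://localhost:3000 | head -n 1
```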
### Deploy to Hugging Face Spaces
1. Create a new Space at https://huggingface.co/new-space and choose **Docker** as the runtime.
2. Push this repository to the Space's Git remote (or upload files through the UI); an example push is shown below. The Space builds the Docker image from the included `Dockerfile` and serves the app on port 3000 (matching `app_port` in the front matter).
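For the Git route, the push might look like this, where `<username>` and `<space-name>` are placeholders for your own account and Space:

```bash
# Add the Space as a remote and push; the Space rebuilds on each push.
git remote add space https://huggingface.co/spaces/<username>/<space-name>
git push space main
```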
Notes:
- If your build needs native dependencies (e.g. `sharp`), the Docker image may require extra apt packages; update the Dockerfile accordingly.
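For example, if `sharp` has to build against a system `libvips`, commands along these lines would go into a `RUN` step in the `Dockerfile` (a sketch for a Debian-based image; the exact package names are an assumption):

```bash
# To be placed in a Dockerfile RUN step (Debian-based base image assumed).
apt-get update && apt-get install -y --no-install-recommends libvips-dev
rm -rf /var/lib/apt/lists/*
```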