Avijit Ghosh committed on
Commit c417f2d · 1 Parent(s): 49d5ba7

Added about page

README.md CHANGED
@@ -10,7 +10,49 @@ app_port: 3000

# AI Evaluation Dashboard

- This repository is a Next.js application for viewing and authoring AI evaluations. It includes demo evaluation fixtures under `public/evaluations/` and a dynamic details page that performs server-side rendering and route-handler based inference.

## Run locally
 
@@ -50,46 +92,7 @@ Visit `http://localhost:3000` to verify.
### Deploy to Hugging Face Spaces

1. Create a new Space at https://huggingface.co/new-space and choose **Docker** as the runtime.
- 2. Add a secret named `HF_TOKEN` (if you plan to access private or gated models or the Inference API) in the Space settings.
- 3. Push this repository to the Space Git (or upload files through the UI). The Space will build the Docker image using the included `Dockerfile` and serve your app on port 3000.

Notes:
- - The app's server may attempt to construct ML pipelines server-side if you use Transformers.js and large models; prefer small/quantized models or use the Hugging Face Inference API instead (see below).
- - If your build needs native dependencies (e.g. `sharp`), the Docker image may require extra apt packages; update the Dockerfile accordingly.
-
- ## Alternative: Use Hugging Face Inference API (avoid hosting model weights)
-
- If downloading and running model weights inside the Space is impractical (memory/disk limits), modify the server route to proxy requests to the Hugging Face Inference API.
-
- Example server-side call (Route Handler):
-
- ```js
- const resp = await fetch('https://api-inference.huggingface.co/models/<model-id>', {
-   method: 'POST',
-   headers: { Authorization: `Bearer ${process.env.HF_TOKEN}`, 'Content-Type': 'application/json' },
-   body: JSON.stringify({ inputs: text })
- })
- const json = await resp.json()
- ```
-
- Store `HF_TOKEN` in the Space secrets and your route will be able to call the API.
-
- ## Troubleshooting
-
- - Build fails in Spaces: check the build logs; you may need extra apt packages or to pin Node version.
- - Runtime OOM / killed: model is too large for Spaces; use Inference API or smaller models.
-
- ## What I added
-
- - `Dockerfile` — multi-stage build for production
- - `.dockerignore` — to reduce image size
- - Updated `README.md` with Spaces frontmatter and deployment instructions
-
- If you want, I can:
- - Modify the Dockerfile to use Next.js standalone mode for a smaller runtime image.
- - Add a small health-check route and a simple `docker-compose.yml` for local testing.
-
- Which of those would you like next?
- npm run build
-
- Send the contents of the "out" folder to https://huggingface.co/spaces/evaleval/general-eval-card
 
# AI Evaluation Dashboard

+ This repository is a Next.js application for viewing and authoring AI evaluations. It provides a comprehensive platform for documenting and sharing AI system evaluations across multiple dimensions including capabilities and risks.
+
+ ## Project Goals
+
+ The AI Evaluation Dashboard aims to:
+ - **Standardize AI evaluation reporting** across different AI systems and models
+ - **Facilitate transparency** by providing detailed evaluation cards for AI systems
+ - **Enable comparative analysis** of AI capabilities and risks
+ - **Support research and policy** by consolidating evaluation data in an accessible format
+ - **Promote responsible AI development** through comprehensive risk assessment
+
+ ## For External Collaborators
+
+ ### Making Changes to Evaluation Categories and Schema
+
+ All evaluation categories, form fields, and data structures are centrally managed in the `schema/` folder. **This is the primary location for making structural changes to the evaluation framework.**
+
+ Key schema files:
+ - **`schema/evaluation-schema.json`** - Defines all evaluation categories (capabilities and risks)
+ - **`schema/output-schema.json`** - Defines the complete data structure for evaluation outputs
+ - **`schema/system-info-schema.json`** - Defines form field options for system information
+ - **`schema/category-details.json`** - Contains detailed descriptions and criteria for each category
+ - **`schema/form-hints.json`** - Provides help text and guidance for form fields
+
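For orientation, the category list above can be inspected straight from `schema/evaluation-schema.json`. The sketch below assumes a top-level `categories` array with `id` and `name` fields; those names are illustrative guesses, not the confirmed shape of the file.

```ts
// list-categories.ts: illustrative only; "categories", "id" and "name" are assumed
// field names, not taken from the actual schema file.
import { readFileSync } from "node:fs"

const schema = JSON.parse(readFileSync("schema/evaluation-schema.json", "utf8"))

// Print each category so structural edits to the schema can be reviewed quickly.
for (const category of schema.categories ?? []) {
  console.log(`${category.id}: ${category.name}`)
}
```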
+ ### Standards and Frameworks Used
+
+ The evaluation framework is based on established standards:
+ - **Risk categories** are derived from **NIST AI 600-1** (AI Risk Management Framework)
+ - **Capability categories** are based on the **OECD AI Classification Framework**
+
+ This ensures consistency with international AI governance standards and facilitates interoperability with other evaluation systems.
+
+ ### Contributing Evaluation Data
+
+ Evaluation data files are stored in `public/evaluations/` as JSON files. Each file represents a complete evaluation of an AI system and must conform to the schema defined in `schema/output-schema.json`.
+
+ To add a new evaluation:
+ 1. Create a new JSON file in `public/evaluations/`
+ 2. Follow the structure defined in `schema/output-schema.json`
+ 3. Ensure all required fields are populated
+ 4. Validate against the schema before submission
+
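This commit does not show a validation script for step 4, so here is a minimal sketch of one way to do it, assuming Node plus the `ajv` package (not a confirmed project dependency) and using `public/evaluations/my-evaluation.json` as a placeholder path. Run it with something like `npx tsx validate-evaluation.ts` before submitting.

```ts
// validate-evaluation.ts: illustrative sketch only. Ajv options (and the JSON Schema
// draft) may need adjusting to match what schema/output-schema.json actually declares.
import Ajv from "ajv"
import { readFileSync } from "node:fs"

const schema = JSON.parse(readFileSync("schema/output-schema.json", "utf8"))
const evaluation = JSON.parse(
  readFileSync("public/evaluations/my-evaluation.json", "utf8") // placeholder file name
)

const ajv = new Ajv({ allErrors: true })
const validate = ajv.compile(schema)

if (validate(evaluation)) {
  console.log("Evaluation conforms to output-schema.json")
} else {
  console.error(validate.errors) // reports missing required fields and type mismatches
  process.exit(1)
}
```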
+ ### Development Setup

## Run locally
 
 
### Deploy to Hugging Face Spaces

1. Create a new Space at https://huggingface.co/new-space and choose **Docker** as the runtime.
+ 2. Push this repository to the Space Git (or upload files through the UI). The Space will build the Docker image using the included `Dockerfile` and serve your app on port 3000.

Notes:
+ - If your build needs native dependencies (e.g. `sharp`), the Docker image may require extra apt packages; update the Dockerfile accordingly.
 
app/about/page.tsx ADDED
@@ -0,0 +1,200 @@
+ import { Card, CardContent, CardDescription, CardHeader, CardTitle } from "@/components/ui/card"
+ import { Badge } from "@/components/ui/badge"
+ import { Separator } from "@/components/ui/separator"
+ import { Button } from "@/components/ui/button"
+ import Link from "next/link"
+ import { ArrowLeft, ExternalLink } from "lucide-react"
+
+ export default function AboutPage() {
+   return (
+     <div className="container mx-auto px-4 py-8 max-w-4xl">
+       <div className="mb-6">
+         <Link href="/">
+           <Button variant="ghost" className="mb-4">
+             <ArrowLeft className="mr-2 h-4 w-4" />
+             Back to Dashboard
+           </Button>
+         </Link>
+         <h1 className="text-4xl font-bold mb-2">About AI Evaluation Dashboard</h1>
+         <p className="text-xl text-muted-foreground">
+           A comprehensive platform for documenting and sharing AI system evaluations
+         </p>
+       </div>
+
+       <div className="grid gap-6">
+         <Card>
+           <CardHeader>
+             <CardTitle>Project Goals</CardTitle>
+             <CardDescription>
+               Our mission is to advance responsible AI development through transparent evaluation
+             </CardDescription>
+           </CardHeader>
+           <CardContent className="space-y-4">
+             <div className="grid gap-3">
+               <div className="flex items-start gap-3">
+                 <div className="w-2 h-2 bg-blue-500 rounded-full mt-2 flex-shrink-0"></div>
+                 <div>
+                   <h4 className="font-semibold">Standardize AI Evaluation Reporting</h4>
+                   <p className="text-sm text-muted-foreground">
+                     Provide a consistent framework for documenting AI system capabilities and limitations across different models and platforms.
+                   </p>
+                 </div>
+               </div>
+               <div className="flex items-start gap-3">
+                 <div className="w-2 h-2 bg-green-500 rounded-full mt-2 flex-shrink-0"></div>
+                 <div>
+                   <h4 className="font-semibold">Facilitate Transparency</h4>
+                   <p className="text-sm text-muted-foreground">
+                     Enable AI developers and researchers to share detailed evaluation results in an accessible, standardized format.
+                   </p>
+                 </div>
+               </div>
+               <div className="flex items-start gap-3">
+                 <div className="w-2 h-2 bg-purple-500 rounded-full mt-2 flex-shrink-0"></div>
+                 <div>
+                   <h4 className="font-semibold">Enable Comparative Analysis</h4>
+                   <p className="text-sm text-muted-foreground">
+                     Support side-by-side comparison of AI systems across multiple dimensions including capabilities and risks.
+                   </p>
+                 </div>
+               </div>
+               <div className="flex items-start gap-3">
+                 <div className="w-2 h-2 bg-orange-500 rounded-full mt-2 flex-shrink-0"></div>
+                 <div>
+                   <h4 className="font-semibold">Support Research and Policy</h4>
+                   <p className="text-sm text-muted-foreground">
+                     Consolidate evaluation data to inform AI research directions and policy development.
+                   </p>
+                 </div>
+               </div>
+               <div className="flex items-start gap-3">
+                 <div className="w-2 h-2 bg-red-500 rounded-full mt-2 flex-shrink-0"></div>
+                 <div>
+                   <h4 className="font-semibold">Promote Responsible AI Development</h4>
+                   <p className="text-sm text-muted-foreground">
+                     Encourage comprehensive risk assessment and responsible deployment practices through structured evaluation.
+                   </p>
+                 </div>
+               </div>
+             </div>
+           </CardContent>
+         </Card>
+
+         {/* EvalEval link removed from page body per request; footer includes external link instead */}
+
+         <Card>
+           <CardHeader>
+             <CardTitle>Standards and Frameworks</CardTitle>
+             <CardDescription>
+               Built on established international standards for AI evaluation
+             </CardDescription>
+           </CardHeader>
+           <CardContent className="space-y-4">
+             <div className="grid gap-4 md:grid-cols-2">
+               <div className="p-4 border rounded-lg">
+                 <div className="flex items-center gap-2 mb-2">
+                   <Badge variant="destructive">Risk Assessment</Badge>
+                 </div>
+                 <h4 className="font-semibold mb-2">NIST AI 600-1</h4>
+                 <p className="text-sm text-muted-foreground mb-3">
+                   Risk categories are derived from the NIST AI Risk Management Framework (AI RMF 1.0), providing a comprehensive approach to identifying and managing AI-related risks.
+                 </p>
+                 <Link href="https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.600-1.pdf" target="_blank">
+                   <Button variant="outline" size="sm">
+                     <ExternalLink className="mr-2 h-3 w-3" />
+                     Learn More
+                   </Button>
+                 </Link>
+               </div>
+               <div className="p-4 border rounded-lg">
+                 <div className="flex items-center gap-2 mb-2">
+                   <Badge variant="default">Capabilities</Badge>
+                 </div>
+                 <h4 className="font-semibold mb-2">OECD AI Classification</h4>
+                 <p className="text-sm text-muted-foreground mb-3">
+                   Capability categories are based on the OECD AI Classification Framework, ensuring alignment with international standards for AI system categorization.
+                 </p>
+                 <Link href="https://www.oecd.org/en/publications/introducing-the-oecd-ai-capability-indicators_be745f04-en/full-report/component-4.html#chapter-d1e230-f85c23a209" target="_blank">
+                   <Button variant="outline" size="sm">
+                     <ExternalLink className="mr-2 h-3 w-3" />
+                     Learn More
+                   </Button>
+                 </Link>
+               </div>
+             </div>
+           </CardContent>
+         </Card>
+
+         <Card>
+           <CardHeader>
+             <CardTitle>For Contributors</CardTitle>
+             <CardDescription>
+               How to contribute to the evaluation framework
+             </CardDescription>
+           </CardHeader>
+           <CardContent className="space-y-4">
+             <div className="p-4 bg-muted rounded-lg">
+               <h4 className="font-semibold mb-2">Schema-Driven Architecture</h4>
+               <p className="text-sm text-muted-foreground mb-3">
+                 All evaluation categories, form fields, and data structures are centrally managed in the <code className="bg-background px-1 py-0.5 rounded">schema/</code> folder. This is the primary location for making structural changes to the evaluation framework.
+               </p>
+               <div className="space-y-2 text-sm">
+                 <div><code className="bg-background px-2 py-1 rounded">schema/evaluation-schema.json</code> - Evaluation categories and types</div>
+                 <div><code className="bg-background px-2 py-1 rounded">schema/output-schema.json</code> - Complete data structure</div>
+                 <div><code className="bg-background px-2 py-1 rounded">schema/system-info-schema.json</code> - Form field options</div>
+                 <div><code className="bg-background px-2 py-1 rounded">schema/category-details.json</code> - Detailed descriptions</div>
+               </div>
+             </div>
+             <div className="p-4 bg-muted rounded-lg">
+               <h4 className="font-semibold mb-2">Adding Evaluations</h4>
+               <p className="text-sm text-muted-foreground">
+                 Evaluation data files are stored in <code className="bg-background px-1 py-0.5 rounded">public/evaluations/</code> as JSON files. Each file represents a complete evaluation of an AI system and must conform to the schema.
+               </p>
+             </div>
+             <Link href="https://huggingface.co/spaces/evaleval/general-eval-card/tree/main/schema" target="_blank">
+               <Button className="w-full flex items-center justify-center gap-2">
+                 <img src="https://huggingface.co/front/assets/huggingface_logo.svg" alt="Hugging Face" className="h-4 w-4" />
+                 View on Hugging Face
+               </Button>
+             </Link>
+           </CardContent>
+         </Card>
+
+         <Card>
+           <CardHeader>
+             <CardTitle>Technical Implementation</CardTitle>
+             <CardDescription>
+               Built with modern web technologies for performance and accessibility
+             </CardDescription>
+           </CardHeader>
+           <CardContent>
+             <div className="grid gap-3 md:grid-cols-3">
+               <div className="text-center p-3 border rounded-lg">
+                 <h4 className="font-semibold">Next.js 14</h4>
+                 <p className="text-xs text-muted-foreground">React framework with SSR</p>
+               </div>
+               <div className="text-center p-3 border rounded-lg">
+                 <h4 className="font-semibold">TypeScript</h4>
+                 <p className="text-xs text-muted-foreground">Type-safe development</p>
+               </div>
+               <div className="text-center p-3 border rounded-lg">
+                 <h4 className="font-semibold">Tailwind CSS</h4>
+                 <p className="text-xs text-muted-foreground">Utility-first styling</p>
+               </div>
+             </div>
+           </CardContent>
+         </Card>
+       </div>
+
+       <Separator className="my-8" />
+
+       <div className="text-center text-sm text-muted-foreground">
+         <p className="mt-2">AI Evaluation Dashboard is an open-source project dedicated to advancing responsible AI development — built with ❤️ by the EvalEval Coalition.</p>
+         <p className="mt-2 flex items-center justify-center gap-3">
+           <img src="https://evalevalai.com/assets/img/logo-square.png" alt="EvalEval" className="h-8 w-8 rounded" />
+           <span>Learn more about EvalEval: <Link href="https://evalevalai.com/" target="_blank">evalevalai.com</Link></span>
+         </p>
+       </div>
+     </div>
+   )
+ }
app/evaluation/[id]/page.client.tsx CHANGED
@@ -5,10 +5,11 @@ import { useState, useEffect } from "react"
import { Button } from "@/components/ui/button"
import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card"
import { Badge } from "@/components/ui/badge"
- import { ArrowLeft, Download, Eye, EyeOff } from "lucide-react"
import { getAllCategories, getCategoryById, getBenchmarkQuestions, getProcessQuestions } from "@/lib/schema"
import { Tooltip, TooltipContent, TooltipProvider, TooltipTrigger } from "@/components/ui/tooltip"
import { naReasonForCategoryFromEval } from "@/lib/na-utils"

const loadEvaluationDetails = async (id: string) => {
  const evaluationFiles = [
@@ -180,10 +181,18 @@ export default function EvaluationDetailsPage() {
        <ArrowLeft className="h-4 w-4 mr-2" />
        Back to Dashboard
      </Button>
-     <Button variant="outline" size="sm">
-       <Download className="h-4 w-4 mr-2" />
-       Export Report
-     </Button>
    </div>

    <div className="mt-3 text-center">
 
import { Button } from "@/components/ui/button"
import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card"
import { Badge } from "@/components/ui/badge"
+ import { ArrowLeft, Download, Eye, EyeOff, Info } from "lucide-react"
import { getAllCategories, getCategoryById, getBenchmarkQuestions, getProcessQuestions } from "@/lib/schema"
import { Tooltip, TooltipContent, TooltipProvider, TooltipTrigger } from "@/components/ui/tooltip"
import { naReasonForCategoryFromEval } from "@/lib/na-utils"
+ import Link from "next/link"

const loadEvaluationDetails = async (id: string) => {
  const evaluationFiles = [
 
        <ArrowLeft className="h-4 w-4 mr-2" />
        Back to Dashboard
      </Button>
+     <div className="flex items-center gap-2">
+       <Link href="/about">
+         <Button variant="ghost" size="sm">
+           <Info className="h-4 w-4 mr-2" />
+           About
+         </Button>
+       </Link>
+       <Button variant="outline" size="sm">
+         <Download className="h-4 w-4 mr-2" />
+         Export Report
+       </Button>
+     </div>
    </div>

    <div className="mt-3 text-center">
app/page.tsx CHANGED
@@ -3,11 +3,12 @@
import { useState, useMemo, useEffect } from "react"
import { Button } from "@/components/ui/button"
import { Select, SelectContent, SelectItem, SelectTrigger, SelectValue } from "@/components/ui/select"
- import { Plus, Moon, Sun, Filter, ArrowUpDown } from "lucide-react"
import { useTheme } from "next-themes"
import { EvaluationCard, type EvaluationCardData } from "@/components/evaluation-card"
import { getBenchmarkQuestions, getProcessQuestions } from "@/lib/schema"
import { AIEvaluationDashboard } from "@/components/ai-evaluation-dashboard"

const loadEvaluationData = async (): Promise<EvaluationCardData[]> => {
  const evaluationFiles = [
@@ -460,6 +461,12 @@ export default function HomePage() {
      <p className="text-sm text-muted-foreground">Manage and track your AI system evaluations</p>
    </div>
    <div className="flex items-center gap-3">
      <Button
        variant="ghost"
        size="sm"
 
import { useState, useMemo, useEffect } from "react"
import { Button } from "@/components/ui/button"
import { Select, SelectContent, SelectItem, SelectTrigger, SelectValue } from "@/components/ui/select"
+ import { Plus, Moon, Sun, Filter, ArrowUpDown, Info } from "lucide-react"
import { useTheme } from "next-themes"
import { EvaluationCard, type EvaluationCardData } from "@/components/evaluation-card"
import { getBenchmarkQuestions, getProcessQuestions } from "@/lib/schema"
import { AIEvaluationDashboard } from "@/components/ai-evaluation-dashboard"
+ import Link from "next/link"

const loadEvaluationData = async (): Promise<EvaluationCardData[]> => {
  const evaluationFiles = [
 
      <p className="text-sm text-muted-foreground">Manage and track your AI system evaluations</p>
    </div>
    <div className="flex items-center gap-3">
+     <Link href="/about">
+       <Button variant="ghost" size="sm" className="gap-2">
+         <Info className="h-4 w-4" />
+         About
+       </Button>
+     </Link>
      <Button
        variant="ghost"
        size="sm"