kshitijthakkar committed on
Commit
e389acb
·
1 Parent(s): 010ba8f

docs: Add annotated screenshots guide with 39 images

Files changed (1)
  1. SCREENSHOTS.md +318 -0
SCREENSHOTS.md ADDED
@@ -0,0 +1,318 @@
1
+ # TraceMind-AI - Screenshots & Visual Guide
2
+
3
+ This document provides annotated screenshots of all screens in TraceMind-AI to help you understand the interface at a glance.
4
+
5
+ > **Live Demo**: https://huggingface.co/spaces/MCP-1st-Birthday/TraceMind
6
+
7
+ ---
8
+
9
+ ## 📊 Screen 1: Leaderboard
10
+
11
+ **Purpose**: Browse all agent evaluation runs with AI-powered insights
12
+
13
+ The Leaderboard is the central hub for comparing agent performance across different models, configurations, and benchmarks.
14
+
15
+ ### Leaderboard Table
16
+ ![Leaderboard Table](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/leaderboard/Leaderboard_screen_leaderboard_tab.png)
17
+
18
+ The main leaderboard table displays all evaluation runs with sortable columns including model name, agent type, success rate, token usage, duration, cost, and CO2 emissions. Click any row to drill down into detailed run results. Use the search and filter options to find specific runs.
19
+
20
+ ### Summary Cards
21
+ ![Summary Cards](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/leaderboard/Leaderboard_screen_Summary_card.png)
22
+
23
+ Quick-glance summary cards show key metrics: total evaluations, average success rate, total tokens processed, and estimated costs. These cards update dynamically based on your current filter selection.
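
As a rough illustration of how card values like these could be derived from the filtered runs, here is a minimal sketch. The record fields (`model`, `success_rate`, `tokens`, `cost_usd`) and the `summarize` helper are hypothetical names chosen for this example; TraceMind's actual schema and implementation may differ.

```python
def summarize(runs, model_filter=None):
    """Aggregate summary-card metrics over the runs matching the filter.

    `runs` is a list of dicts with hypothetical keys:
    model, success_rate (percent), tokens, cost_usd.
    """
    selected = [r for r in runs if model_filter is None or r["model"] == model_filter]
    count = len(selected)
    return {
        "total_evaluations": count,
        # Guard against an empty selection to avoid dividing by zero.
        "avg_success_rate": sum(r["success_rate"] for r in selected) / count if count else 0.0,
        "total_tokens": sum(r["tokens"] for r in selected),
        "total_cost_usd": round(sum(r["cost_usd"] for r in selected), 4),
    }


# Made-up sample data, for illustration only.
runs = [
    {"model": "model-a", "success_rate": 90.0, "tokens": 1200, "cost_usd": 0.25},
    {"model": "model-a", "success_rate": 70.0, "tokens": 800, "cost_usd": 0.15},
    {"model": "model-b", "success_rate": 50.0, "tokens": 500, "cost_usd": 0.10},
]

all_cards = summarize(runs)             # no filter: every run counts
a_cards = summarize(runs, "model-a")    # filtered: cards change with the selection
```

Recomputing the aggregates from the filtered list on every change is the simplest way to keep the cards consistent with the table.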
24
+
25
+ ### AI Insights Tab
26
+ ![AI Insights](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/leaderboard/Leaderboard_screen_ai_insights_tab.png)
27
+
28
+ AI-powered analysis of the leaderboard data, generated using the TraceMind MCP Server's `analyze_leaderboard` tool. Provides intelligent summaries of trends, top performers, and recommendations based on your evaluation history.
29
+
30
+ ### Analytics - Cost Efficiency
31
+ ![Cost Efficiency](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/leaderboard/Leaderboard_screen_analytics_tab_cost_efficiency.png)
32
+
33
+ Interactive chart comparing cost efficiency across different models. Visualizes the relationship between accuracy achieved and cost per evaluation, helping you identify the most cost-effective models for your use case.
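
One simple way to turn the accuracy-versus-cost relationship into a ranking is success-rate points per dollar. The sketch below assumes that metric; the `RunSummary` fields and the sample numbers are hypothetical, and the chart in TraceMind may use a different definition.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    # Hypothetical fields for illustration; not TraceMind's real schema.
    model: str
    success_rate: float  # percent, 0-100
    cost_usd: float      # total cost of the run in US dollars


def cost_efficiency(run):
    """Success-rate points obtained per dollar spent."""
    if run.cost_usd <= 0:
        # Treat zero-cost runs (e.g. local models) as maximally efficient.
        return float("inf")
    return run.success_rate / run.cost_usd


def rank_by_cost_efficiency(runs):
    """Return runs ordered from most to least cost-efficient."""
    return sorted(runs, key=cost_efficiency, reverse=True)


# Made-up sample data.
runs = [
    RunSummary("model-a", 90.0, 3.00),  # 30 points per dollar
    RunSummary("model-b", 80.0, 1.00),  # 80 points per dollar
    RunSummary("model-c", 60.0, 0.50),  # 120 points per dollar
]
ranked = [run.model for run in rank_by_cost_efficiency(runs)]
```

Note that the most accurate model here ranks last on this metric, which is the kind of trade-off the chart is meant to surface.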
34
+
35
+ ### Analytics - Performance Heatmap
36
+ ![Performance Heatmap](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/leaderboard/Leaderboard_screen_analytics_tab_performance_heatmap.png)
37
+
38
+ Heatmap visualization showing performance patterns across different test categories and models. Darker colors indicate higher success rates, making it easy to spot strengths and weaknesses of each model.
39
+
40
+ ### Analytics - Speed vs Accuracy
41
+ ![Speed vs Accuracy](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/leaderboard/Leaderboard_screen_analytics_tab_speed_vs_accuracy.png)
42
+
43
+ Scatter plot comparing execution speed against accuracy for all runs. Helps identify models that offer the best balance of speed and quality for time-sensitive applications.
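
"Best balance" on a speed-versus-accuracy plot can be made precise with a Pareto front: the runs that no other run beats on both axes at once. The following is a minimal sketch of that idea, not the plot's actual logic; the tuple layout and sample values are assumptions.

```python
def pareto_front(runs):
    """Return the names of runs that are not dominated by any other run.

    Each run is a (name, duration_seconds, accuracy_percent) tuple.
    Run X dominates run Y when X is at least as fast and at least as
    accurate as Y, and strictly better on at least one of the two.
    """
    front = []
    for name, duration, accuracy in runs:
        dominated = any(
            other_duration <= duration
            and other_accuracy >= accuracy
            and (other_duration < duration or other_accuracy > accuracy)
            for _, other_duration, other_accuracy in runs
        )
        if not dominated:
            front.append(name)
    return front


# Made-up sample data.
runs = [
    ("fast-small", 10.0, 70.0),
    ("slow-large", 60.0, 95.0),
    ("slow-weak", 50.0, 65.0),   # slower and less accurate than fast-small
    ("balanced", 30.0, 88.0),
]
front = pareto_front(runs)
```

Every run on the front is a defensible choice; which one to pick depends on how much latency the application can tolerate.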
44
+
45
+ ### Trends Tab
46
+ ![Trends](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/leaderboard/Leaderboard_screen_trends_tab.png)
47
+
48
+ Historical trends showing how model performance has evolved over time. Track improvements in accuracy, cost reduction, and speed optimization across your evaluation history.
49
+
50
+ ---
51
+
52
+ ## 🤖 Screen 2: Agent Chat
53
+
54
+ **Purpose**: Interactive autonomous agent powered by MCP tools
55
+
56
+ The Agent Chat provides a conversational interface to interact with the TraceMind MCP Server. Ask questions about your evaluations, request analysis, or generate insights using natural language.
57
+
58
+ ### Chat Interface (Part 1)
59
+ ![Agent Chat Part 1](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/agent_chat/Agent_chat_screen_part_1.png)
60
+
61
+ The chat interface header and input area. Type natural language queries like "What was my best performing model last week?" or "Compare GPT-4 vs Claude on code generation tasks." The agent autonomously selects and executes appropriate MCP tools.
62
+
63
+ ### Chat Interface (Part 2)
64
+ ![Agent Chat Part 2](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/agent_chat/Agent_chat_screen_part_2.png)
65
+
66
+ Example conversation showing the agent executing MCP tools to answer questions. Notice how the agent shows its reasoning process and which tools it's using, providing full transparency into the analysis workflow.
67
+
68
+ ### Chat Interface (Part 3)
69
+ ![Agent Chat Part 3](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/agent_chat/Agent_chat_screen_part_3.png)
70
+
71
+ Extended conversation demonstrating multi-turn interactions. The agent maintains context across messages, allowing for follow-up questions and iterative exploration of your evaluation data.
72
+
73
+ ---
74
+
75
+ ## 🚀 Screen 3: New Evaluation
76
+
77
+ **Purpose**: Submit evaluation jobs to HuggingFace Jobs or Modal
78
+
79
+ Configure and submit new agent evaluation jobs directly from the UI. Supports both API-based models (via LiteLLM) and local models (via Transformers).
80
+
81
+ ### Configuration Form (Part 1)
82
+ ![New Evaluation Part 1](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/new_evaluation/new_eval_screen_part_1.png)
83
+
84
+ Model selection and basic configuration. Choose from supported models (OpenAI, Anthropic, Llama, etc.), select agent type (ToolCallingAgent or CodeAgent), and configure the evaluation benchmark dataset.
85
+
86
+ ### Configuration Form (Part 2)
87
+ ![New Evaluation Part 2](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/new_evaluation/new_eval_screen_part_2.png)
88
+
89
+ Advanced options including hardware selection (CPU, A10 GPU, H200 GPU), number of test cases, timeout settings, and OpenTelemetry instrumentation options for detailed tracing.
90
+
91
+ ### Cost Estimation
92
+ ![Cost Estimation](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/new_evaluation/new_eval_screen_estimate_cost.png)
93
+
94
+ Real-time cost estimation before submitting your job. Shows estimated compute costs, API costs (for LiteLLM models), total cost, and estimated duration. Uses the TraceMind MCP Server's `estimate_cost` tool for accurate predictions.
95
+
96
+ ### Submit Evaluation
97
+ ![Submit Evaluation](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/new_evaluation/new_eval_screen_submit_eval.png)
98
+
99
+ Final review and submission screen. Confirm your configuration, review the estimated costs, and submit the job to either HuggingFace Jobs or Modal for execution.
100
+
101
+ ---
102
+
103
+ ## 📈 Screen 4: Job Monitoring
104
+
105
+ **Purpose**: Track status of submitted evaluation jobs
106
+
107
+ Monitor the progress and status of all your submitted evaluation jobs in real time.
108
+
109
+ ### Recent Jobs
110
+ ![Recent Jobs](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/job_monitoring/Job_monitoring_screen_recent_jobs_tab.png)
111
+
112
+ List of recently submitted jobs with status indicators (Pending, Running, Completed, Failed). Shows job ID, model, submission time, and current progress. Click any job to view detailed logs.
113
+
114
+ ### Inspect Jobs
115
+ ![Inspect Jobs](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/job_monitoring/Job_monitoring_screen_inspect_jobs_tab.png)
116
+
117
+ Detailed job inspection view showing real-time logs, resource utilization, and progress metrics. Useful for debugging failed jobs or monitoring long-running evaluations.
118
+
119
+ ---
120
+
121
+ ## 📋 Screen 5: Run Details
122
+
123
+ **Purpose**: View detailed results for a specific evaluation run
124
+
125
+ Deep dive into the results of a completed evaluation run with comprehensive metrics and visualizations.
126
+
127
+ ### Overview Tab
128
+ ![Run Details Overview](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/run_details/Run_details_screen_Overview_tab.png)
129
+
130
+ High-level summary of the evaluation run including success rate, total tokens, cost breakdown, and execution time. Quick-access buttons to view traces, compare with other runs, or download results.
131
+
132
+ ### Test Cases Tab
133
+ ![Test Cases](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/run_details/Run_details_screen_test_cases_tab.png)
134
+
135
+ Detailed breakdown of individual test cases. Shows each test's prompt, expected output, actual response, success/failure status, and execution metrics. Click any test case to view its full trace.
136
+
137
+ ### Performance Tab
138
+ ![Performance](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/run_details/Run_details_screen_performance_tab.png)
139
+
140
+ Performance charts showing token distribution, latency breakdown, and cost analysis per test case. Identify bottlenecks and outliers in your evaluation run.
141
+
142
+ ### AI Insights Tab
143
+ ![AI Insights](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/run_details/Run_details_screen_ai_insights_tab.png)
144
+
145
+ AI-generated analysis of this specific run, powered by the TraceMind MCP Server's `analyze_results` tool. Provides detailed breakdown of failure patterns, success factors, and recommendations for improvement.
146
+
147
+ ### GPU Metrics Tab
148
+ ![GPU Metrics](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/run_details/Run_details_screen_gpu_metrics_tab.png)
149
+
150
+ GPU utilization metrics for runs executed on GPU hardware (A10 or H200). Shows memory usage, compute utilization, temperature, and power consumption over time. Only available for GPU-accelerated jobs.
151
+
152
+ ---
153
+
154
+ ## 🔍 Screen 6: Trace Visualization
155
+
156
+ **Purpose**: Deep-dive into agent execution traces with OpenTelemetry data
157
+
158
+ Explore the complete execution flow of individual agent runs using OpenTelemetry (OTEL) trace data.
159
+
160
+ ### Waterfall View
161
+ ![Trace Waterfall](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/trace_detail/Trace_detail_screen_waterfall_tab.png)
162
+
163
+ Timeline visualization of the agent's execution flow. Shows the sequence of LLM calls, tool invocations, and reasoning steps with precise timing information. Hover over any span for detailed attributes.
164
+
165
+ ### Span Details Tab
166
+ ![Span Details](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/trace_detail/Trace_detail_screen_span_details_tab.png)
167
+
168
+ Detailed view of individual spans in the trace. Shows span name, parent-child relationships, duration, and all OpenTelemetry attributes including token counts, model parameters, and tool inputs/outputs.
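
The parent-child relationships and durations shown in this tab can be reconstructed from a flat list of spans. The sketch below assumes each span is a dict with `span_id`, `parent_span_id`, and nanosecond `start_time_ns` / `end_time_ns` keys; these key names and the sample spans are illustrative, and the exported trace format may name them differently.

```python
from collections import defaultdict


def build_span_tree(spans):
    """Group span IDs by parent and collect the root spans.

    Returns (roots, children), where `children` maps a parent span ID
    to the list of its direct child span IDs, in input order.
    """
    children = defaultdict(list)
    roots = []
    for span in spans:
        parent_id = span.get("parent_span_id")
        if parent_id is None:
            roots.append(span["span_id"])
        else:
            children[parent_id].append(span["span_id"])
    return roots, dict(children)


def duration_ms(span):
    """Span duration in milliseconds, from nanosecond timestamps."""
    return (span["end_time_ns"] - span["start_time_ns"]) / 1_000_000


# Made-up trace: one agent run containing an LLM call and a tool call.
spans = [
    {"span_id": "s1", "parent_span_id": None, "name": "agent.run",
     "start_time_ns": 0, "end_time_ns": 5_000_000_000},
    {"span_id": "s2", "parent_span_id": "s1", "name": "llm.call",
     "start_time_ns": 100_000_000, "end_time_ns": 2_100_000_000},
    {"span_id": "s3", "parent_span_id": "s1", "name": "tool.call",
     "start_time_ns": 2_200_000_000, "end_time_ns": 2_700_000_000},
]

roots, children = build_span_tree(spans)
llm_duration = duration_ms(spans[1])
```

The same `children` mapping is enough to render a waterfall: walk from each root and indent every child under its parent.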
169
+
170
+ ### Thought Graph Tab
171
+ ![Thought Graph](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/trace_detail/Trace_detail_screen_thought_graph_tab.png)
172
+
173
+ Visual graph representation of the agent's reasoning process. Shows how thoughts, tool calls, and observations connect to form the agent's decision-making flow. Great for understanding complex multi-step reasoning.
174
+
175
+ ### Raw Data Tab
176
+ ![Raw Data](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/trace_detail/Trace_detail_screen_raw_data_tab.png)
177
+
178
+ Full JSON export of the OTEL trace data for advanced analysis or integration with external observability tools. Copy or download the complete trace for offline analysis.
179
+
180
+ ### About This Trace
181
+ ![About Trace](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/trace_detail/Trace_detail_screen_about_this_trace.png)
182
+
183
+ Summary information about the trace including trace ID, associated run, test case, total duration, and span count. Provides context for understanding what this trace represents.
184
+
185
+ ---
186
+
187
+ ## ⚖️ Screen 7: Compare Runs
188
+
189
+ **Purpose**: Side-by-side comparison of evaluation runs
190
+
191
+ Compare multiple evaluation runs to understand performance differences between models, configurations, or time periods.
192
+
193
+ ### Side-by-Side Comparison
194
+ ![Side by Side](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/compare_runs/Compare_run_screen_side-by-side-tab.png)
195
+
196
+ Direct comparison table showing metrics from two selected runs side by side. Highlights differences in success rate, cost, speed, and token usage with clear visual indicators.
197
+
198
+ ### Report Card
199
+ ![Report Card](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/compare_runs/Compare_run_screen_report_card-tab.png)
200
+
201
+ Generated comparison report with winner/loser indicators for each metric category. Provides a quick summary of which run performs better and by how much.
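
The per-metric differences and winner indicators described above can be sketched as a small comparison function. The metric names, the higher-is-better table, and the sample runs are assumptions made for this example rather than TraceMind's actual scoring rules.

```python
# Hypothetical metric names; True means a higher value is better.
HIGHER_IS_BETTER = {
    "success_rate": True,
    "cost_usd": False,
    "duration_s": False,
    "total_tokens": False,
}


def compare_two_runs(run_a, run_b):
    """Return, per metric, both values, the difference, and the winner."""
    report = {}
    for metric, higher_is_better in HIGHER_IS_BETTER.items():
        a, b = run_a[metric], run_b[metric]
        if a == b:
            winner = "tie"
        elif (a > b) == higher_is_better:
            winner = "A"
        else:
            winner = "B"
        report[metric] = {"a": a, "b": b, "delta": b - a, "winner": winner}
    return report


# Made-up sample runs.
run_a = {"success_rate": 85.0, "cost_usd": 1.20, "duration_s": 300, "total_tokens": 50_000}
run_b = {"success_rate": 80.0, "cost_usd": 0.60, "duration_s": 300, "total_tokens": 42_000}

report = compare_two_runs(run_a, run_b)
winners = {metric: row["winner"] for metric, row in report.items()}
```

In this example run A wins on success rate while run B wins on cost and tokens, so there is no overall winner without weighting the metrics.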
202
+
203
+ ### Radar Comparison
204
+ ![Radar Comparison](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/compare_runs/Compare_run_screen_Radar_comparision-tab.png)
205
+
206
+ Radar chart visualization comparing runs across multiple dimensions: accuracy, speed, cost efficiency, token efficiency, and consistency. Quickly identify trade-offs between different configurations.
207
+
208
+ ### AI Insights
209
+ ![Compare AI Insights](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/compare_runs/Compare_run_screen_AI_Insights_tab.png)
210
+
211
+ AI-powered comparison analysis using the TraceMind MCP Server's `compare_runs` tool. Provides intelligent narrative explaining the key differences, likely causes, and recommendations for choosing between models.
212
+
213
+ ---
214
+
215
+ ## 🧪 Screen 8: Synthetic Data Generator
216
+
217
+ **Purpose**: Generate custom test datasets with AI
218
+
219
+ Create custom evaluation datasets tailored to your specific use case using AI-powered generation.
220
+
221
+ ### Generator Interface
222
+ ![Synthetic Data Generator](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/synthetic_data/Synthetic_data_generator_screen_general.png)
223
+
224
+ Configure synthetic data generation by specifying the domain, complexity level, number of test cases, and any specific requirements. Uses the TraceMind MCP Server's `generate_test_cases` tool to create diverse, realistic test scenarios. Preview generated cases before saving to your evaluation dataset.
225
+
226
+ ---
227
+
228
+ ## ⚙️ Screen 9: Settings
229
+
230
+ **Purpose**: Configure API keys and preferences
231
+
232
+ Manage your TraceMind configuration including API keys, default settings, and integration options.
233
+
234
+ ### Settings (Part 1)
235
+ ![Settings Part 1](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/settings/settings_screen_part_1.png)
236
+
237
+ API key configuration for various providers: OpenAI, Anthropic, HuggingFace, and Google Gemini. Keys are stored securely and masked after entry. The test-connection buttons verify that each key works.
238
+
239
+ ### Settings (Part 2)
240
+ ![Settings Part 2](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/settings/settings_screen_part_2.png)
241
+
242
+ Additional settings including default model selection, MCP server connection settings, notification preferences, and data export options. Configure TraceMind to match your workflow preferences.
243
+
244
+ ---
245
+
246
+ ## 📚 Screen 10: Documentation
247
+
248
+ **Purpose**: In-app documentation and guides
249
+
250
+ Comprehensive documentation accessible directly within TraceMind, covering all features and integrations.
251
+
252
+ ### About Tab
253
+ ![Documentation About](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/documentation/documentation_screen_about_tab.png)
254
+
255
+ Overview of TraceMind-AI including its purpose, key features, and the ecosystem it's part of. Includes quick links to demo videos, GitHub repositories, and community resources.
256
+
257
+ ### TraceVerde Tab
258
+ ![TraceVerde Docs](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/documentation/documentation_screen_traceverde_tab.png)
259
+
260
+ Documentation for TraceVerde (genai_otel_instrument), the OpenTelemetry instrumentation library that powers TraceMind's tracing capabilities. Shows installation, usage examples, and supported frameworks.
261
+
262
+ ### SMOLTRACE Tab
263
+ ![SMOLTRACE Docs](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/documentation/documentation_screen_smoltrace_tab.png)
264
+
265
+ Documentation for SMOLTRACE, the evaluation engine backend. Covers configuration options, benchmark datasets, and integration with HuggingFace datasets for storing evaluation results.
266
+
267
+ ### TraceMind MCP Server Tab
268
+ ![MCP Server Docs](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/documentation/documentation_screen_tracemind_mcp_server_tab.png)
269
+
270
+ Complete documentation for the TraceMind MCP Server including all 11 tools, 3 resources, and 3 prompts. Shows how to connect the MCP server to Claude Desktop, Cursor, or other MCP clients.
271
+
272
+ ### Job Submission Tab
273
+ ![Job Submission Docs](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/documentation/documentation_screen_job_submission_tab.png)
274
+
275
+ Guide to submitting evaluation jobs via HuggingFace Jobs or Modal. Covers hardware options (CPU, A10, H200), cost considerations, and best practices for running large-scale evaluations.
276
+
277
+ ---
278
+
279
+ ## 📊 Dashboard
280
+
281
+ ### Dashboard Overview
282
+ ![Dashboard](https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/screenshots/dashboard/Dashboard_tab.png)
283
+
284
+ The main dashboard provides a unified view of your TraceMind activity: recent evaluations, quick stats, trending models, and shortcut buttons to common actions. This is your starting point for navigating the full TraceMind-AI experience.
285
+
286
+ ---
287
+
288
+ ## 📦 Screenshot Summary
289
+
290
+ | Screen | Screenshots | Key Features |
291
+ |--------|-------------|--------------|
292
+ | Leaderboard | 7 | Sortable table, summary cards, AI insights, analytics charts, trends |
293
+ | Agent Chat | 3 | Natural language queries, MCP tool execution, multi-turn conversations |
294
+ | New Evaluation | 4 | Model selection, hardware config, cost estimation, job submission |
295
+ | Job Monitoring | 2 | Job status tracking, real-time logs, progress monitoring |
296
+ | Run Details | 5 | Overview metrics, test cases, performance charts, AI analysis, GPU metrics |
297
+ | Trace Visualization | 5 | Waterfall timeline, span details, thought graph, raw OTEL data |
298
+ | Compare Runs | 4 | Side-by-side metrics, report card, radar chart, AI comparison |
299
+ | Synthetic Data | 1 | AI-powered test case generation, domain configuration |
300
+ | Settings | 2 | API key management, default preferences, MCP connection |
301
+ | Documentation | 5 | About, TraceVerde, SMOLTRACE, MCP Server, Job Submission guides |
302
+ | Dashboard | 1 | Activity overview, quick stats, navigation shortcuts |
303
+ | **Total** | **39** | Complete UI coverage with explanatory descriptions |
304
+
305
+ ---
306
+
307
+ ## 🔗 Related Documentation
308
+
309
+ - [README.md](README.md) - Quick start guide
310
+ - [USER_GUIDE.md](USER_GUIDE.md) - Complete walkthrough
311
+ - [MCP_INTEGRATION.md](MCP_INTEGRATION.md) - Technical MCP details
312
+ - [ARCHITECTURE.md](ARCHITECTURE.md) - System architecture
313
+
314
+ ---
315
+
316
+ **Status**: ✅ Screenshots Complete - 39 annotated images organized and deployed
317
+
318
+ **Last Updated**: November 2025