# TraceMind-AI - Technical Architecture

This document provides a deep technical dive into the TraceMind-AI architecture, implementation details, and system design.

## Table of Contents

- [System Overview](#system-overview)
- [Project Structure](#project-structure)
- [Core Components](#core-components)
- [MCP Client Architecture](#mcp-client-architecture)
- [Agent Framework Integration](#agent-framework-integration)
- [Data Flow](#data-flow)
- [Authentication & Authorization](#authentication--authorization)
- [Screen Navigation](#screen-navigation)
- [Job Submission Architecture](#job-submission-architecture)
- [Deployment](#deployment)
- [Performance Optimization](#performance-optimization)

---
## System Overview

TraceMind-AI is a comprehensive Gradio-based web application for evaluating AI agent performance. It serves as the user-facing platform in the TraceMind ecosystem, demonstrating enterprise MCP client usage (Track 2: MCP in Action).

### Technology Stack

| Component | Technology | Version | Purpose |
|-----------|-----------|---------|---------|
| **UI Framework** | Gradio | 5.49.1 | Web interface with components |
| **MCP Client** | MCP Python SDK | Latest | Connect to MCP servers |
| **Agent Framework** | smolagents | 1.22.0+ | Autonomous agent with MCP tools |
| **Data Source** | HuggingFace Datasets | Latest | Load evaluation results |
| **Authentication** | HuggingFace OAuth | - | User authentication |
| **Job Platforms** | HF Jobs + Modal | - | Evaluation job submission |
| **Language** | Python | 3.10+ | Core implementation |

### High-Level Architecture
```
┌──────────────────────────────────────────────────────────┐
│                       User Browser                       │
│   - Gradio Interface (React-based)                       │
│   - OAuth Flow (HuggingFace)                             │
└───────────────┬──────────────────────────────────────────┘
                │
                │ HTTP/WebSocket
                ↓
┌──────────────────────────────────────────────────────────┐
│            TraceMind-AI (Gradio App) - Track 2           │
│                                                          │
│  ┌────────────────────────────────────────────────────┐  │
│  │              Screen Layer (screens/)               │  │
│  │   - Leaderboard                                    │  │
│  │   - Agent Chat                                     │  │
│  │   - New Evaluation                                 │  │
│  │   - Job Monitoring                                 │  │
│  │   - Trace Detail                                   │  │
│  │   - Settings                                       │  │
│  └───────────┬────────────────────────────────────────┘  │
│              │                                           │
│  ┌───────────┴────────────────────────────────────────┐  │
│  │           Component Layer (components/)            │  │
│  │   - Leaderboard Table (Custom HTML)                │  │
│  │   - Analytics Charts                               │  │
│  │   - Metric Displays                                │  │
│  │   - Report Cards                                   │  │
│  └───────────┬────────────────────────────────────────┘  │
│              │                                           │
│  ┌───────────┴────────────────────────────────────────┐  │
│  │                   Service Layer                    │  │
│  │  ┌──────────────────┐  ┌──────────────────┐        │  │
│  │  │    MCP Client    │  │   Data Loader    │        │  │
│  │  │  (mcp_client/)   │  │ (data_loader.py) │        │  │
│  │  └──────────────────┘  └──────────────────┘        │  │
│  │  ┌──────────────────┐  ┌──────────────────┐        │  │
│  │  │ Agent (smolagents│  │  Job Submission  │        │  │
│  │  │ screens/chat.py) │  │     (utils/)     │        │  │
│  │  └──────────────────┘  └──────────────────┘        │  │
│  └────────────────────────────────────────────────────┘  │
│                                                          │
└───────────┬──────────────────────────────┬───────────────┘
            │                              │
            ↓                              ↓
┌───────────────────────┐      ┌───────────────────────┐
│ TraceMind MCP Server  │      │  External Services    │
│      (Track 1)        │      │  - HF Datasets        │
│  - 11 AI Tools        │      │  - HF Jobs            │
│  - 3 Resources        │      │  - Modal              │
│  - 3 Prompts          │      │  - LLM APIs           │
└───────────────────────┘      └───────────────────────┘
```
---

## Project Structure

```
TraceMind-AI/
├── app.py                        # Main entry point, Gradio app
│
├── screens/                      # UI screens (6 tabs)
│   ├── __init__.py
│   ├── leaderboard.py            # Screen 1: Leaderboard with AI insights
│   ├── chat.py                   # Screen 2: Agent Chat (smolagents)
│   ├── dashboard.py              # Screen 3: New Evaluation
│   ├── job_monitoring.py         # Screen 4: Job Status Tracking
│   ├── trace_detail.py           # Screen 5: Trace Visualization
│   ├── settings.py               # Screen 6: API Key Configuration
│   ├── compare.py                # Screen 7: Run Comparison (optional)
│   ├── documentation.py          # Screen 8: API Documentation
│   └── mcp_helpers.py            # Shared MCP client helpers
│
├── components/                   # Reusable UI components
│   ├── __init__.py
│   ├── leaderboard_table.py      # Custom HTML table component
│   ├── analytics_charts.py       # Performance charts (Plotly)
│   ├── metric_displays.py        # Metric cards and badges
│   ├── report_cards.py           # Summary report cards
│   └── thought_graph.py          # Agent reasoning visualization
│
├── mcp_client/                   # MCP client implementation
│   ├── __init__.py
│   ├── client.py                 # Async MCP client
│   └── sync_wrapper.py           # Synchronous wrapper for Gradio
│
├── utils/                        # Utility modules
│   ├── __init__.py
│   ├── auth.py                   # HuggingFace OAuth
│   ├── navigation.py             # Screen navigation state
│   ├── hf_jobs_submission.py     # HuggingFace Jobs integration
│   └── modal_job_submission.py   # Modal integration
│
├── styles/                       # Custom styling
│   ├── __init__.py
│   └── tracemind_theme.py        # Gradio theme customization
│
├── data_loader.py                # Dataset loading and caching
├── requirements.txt              # Python dependencies
├── .env.example                  # Environment variable template
├── .gitignore
├── README.md                     # Project documentation
└── USER_GUIDE.md                 # Complete user guide
```

Total: ~35 files, ~8,000 lines of code
### File Breakdown

| Directory | Files | Lines | Purpose |
|-----------|-------|-------|---------|
| `screens/` | 9 | ~3,500 | UI screen implementations |
| `components/` | 5 | ~1,200 | Reusable UI components |
| `mcp_client/` | 3 | ~800 | MCP client integration |
| `utils/` | 4 | ~1,500 | Authentication, jobs, navigation |
| `styles/` | 2 | ~300 | Custom theme and CSS |
| Root | 3 | ~700 | Main app, data loader, config |
---

## Core Components

### 1. app.py - Main Application

**Purpose**: Entry point; orchestrates all screens and manages global state.

**Architecture**:

```python
# app.py structure (condensed)
import os

import gradio as gr

from screens import *
from mcp_client.sync_wrapper import get_sync_mcp_client
from utils.auth import auth_ui
from data_loader import DataLoader
from styles.tracemind_theme import tracemind_theme  # assumed export from styles/

# Parsed from the DISABLE_OAUTH env var (see Environment Variables below)
DISABLE_OAUTH = os.getenv("DISABLE_OAUTH", "false").lower() == "true"

# 1. Initialize services (global instances shared by all screens)
mcp_client = get_sync_mcp_client()
mcp_client.initialize()
data_loader = DataLoader()

# 2. Create Gradio app
with gr.Blocks(theme=tracemind_theme) as app:
    # Global state (user session, navigation, etc.)
    session_state = gr.State({})

    # Authentication (if not disabled)
    if not DISABLE_OAUTH:
        auth_ui()

    # Main tabs
    with gr.Tabs():
        with gr.Tab("📊 Leaderboard"):
            leaderboard_screen()
        with gr.Tab("🤖 Agent Chat"):
            chat_screen()
        with gr.Tab("🚀 New Evaluation"):
            dashboard_screen()
        with gr.Tab("📈 Job Monitoring"):
            job_monitoring_screen()
        with gr.Tab("⚙️ Settings"):
            settings_screen()

# 3. Launch
if __name__ == "__main__":
    app.launch(
        server_name="0.0.0.0",
        server_port=7860,
        share=False
    )
```

**Key Responsibilities**:
- Initialize MCP client and data loader (global instances)
- Create tabbed interface with all screens
- Manage authentication flow
- Handle global state (user session, API keys)

---
### 2. Screen Layer (screens/)

Each screen is a self-contained module that returns a Gradio component tree.

#### screens/leaderboard.py

**Purpose**: Display evaluation results with AI-powered insights.

**Components**:
- Load button
- AI insights panel (Markdown), powered by the MCP server
- Leaderboard table (custom HTML component)
- Filter controls (agent type, provider)

**MCP Integration**:

```python
from datasets import load_dataset
import pandas as pd

def load_leaderboard(mcp_client):
    # 1. Load dataset (train split, so it converts cleanly to a DataFrame)
    ds = load_dataset("kshitijthakkar/smoltrace-leaderboard", split="train")
    df = pd.DataFrame(ds)

    # 2. Get AI insights from MCP server
    insights = mcp_client.analyze_leaderboard(
        metric_focus="overall",
        time_range="last_week",
        top_n=5
    )

    # 3. Render table with custom component
    table_html = render_leaderboard_table(df)
    return insights, table_html
```
#### screens/chat.py

**Purpose**: Autonomous agent interface with MCP tool access.

**Agent Setup**:

```python
import os

from smolagents import ToolCallingAgent, MCPClient, HfApiModel

# Initialize agent with tools discovered from the MCP server
def create_agent():
    mcp_client = MCPClient({"url": MCP_SERVER_URL})
    model = HfApiModel(
        model_id="Qwen/Qwen2.5-Coder-32B-Instruct",
        token=os.getenv("HF_TOKEN")
    )
    agent = ToolCallingAgent(
        tools=mcp_client.get_tools(),  # MCP tools discovered automatically
        model=model,
        max_steps=10
    )
    return agent

# Chat interaction
def agent_chat(message, history, show_reasoning):
    # 2 = show tool execution steps, 0 = only the final answer
    agent.verbosity_level = 2 if show_reasoning else 0
    response = agent.run(message)
    history.append((message, response))
    return history, ""
```

**MCP Tool Access**:
The agent automatically discovers and uses all 11 MCP tools from the TraceMind MCP Server.
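
For illustration, a minimal sketch of that discovery step, assuming smolagents' `MCPClient` is pointed at the server's SSE endpoint (the same URL configured via `MCP_SERVER_URL` below):

```python
from smolagents import MCPClient

MCP_SERVER_URL = "https://mcp-1st-birthday-tracemind-mcp-server.hf.space/gradio_api/mcp/sse"

# Used as a context manager, MCPClient connects, lists the server's tools,
# and yields them as smolagents Tool objects
with MCPClient({"url": MCP_SERVER_URL}) as tools:
    for tool in tools:
        print(f"{tool.name}: {tool.description}")
```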
#### screens/dashboard.py

**Purpose**: Submit evaluation jobs to HuggingFace Jobs or Modal.

**Key Functions**:
- Model selection (text input)
- Infrastructure choice (HF Jobs / Modal)
- Hardware selection (auto / manual)
- Cost estimation (MCP-powered)
- Job submission

**Cost Estimation Flow**:

```python
def estimate_cost_click(model, agent_type, num_tests, hardware, mcp_client):
    # Call MCP server for cost estimate
    estimate = mcp_client.estimate_cost(
        model=model,
        agent_type=agent_type,
        num_tests=num_tests,
        hardware=hardware
    )
    return estimate  # Display in dialog
```

**Job Submission Flow**:

```python
def submit_job(model, agent_type, hardware, infrastructure, api_keys):
    if infrastructure == "HuggingFace Jobs":
        job_id = submit_hf_job(model, agent_type, hardware, api_keys)
    elif infrastructure == "Modal":
        job_id = submit_modal_job(model, agent_type, hardware, api_keys)
    return f"✅ Job submitted: {job_id}"
```
#### screens/job_monitoring.py

**Purpose**: Track the status of submitted jobs.

**Data Source**: HuggingFace Jobs API or Modal API

**Refresh Strategy**:
- Manual refresh button
- Auto-refresh every 30 seconds (optional; see the sketch below)
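
A sketch of how the optional auto-refresh could be wired with Gradio's `gr.Timer`; the `fetch_job_statuses` helper is hypothetical:

```python
import gradio as gr

def fetch_job_statuses():
    """Hypothetical helper: poll the HF Jobs / Modal APIs and return table rows"""
    ...

with gr.Blocks() as demo:
    status_table = gr.Dataframe(label="Job Status")
    refresh_btn = gr.Button("🔄 Refresh")

    # Manual refresh
    refresh_btn.click(fn=fetch_job_statuses, outputs=status_table)

    # Optional auto-refresh: tick fires every 30 seconds
    timer = gr.Timer(30)
    timer.tick(fn=fetch_job_statuses, outputs=status_table)
```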
#### screens/trace_detail.py

**Purpose**: Visualize OpenTelemetry traces with GPU metrics.

**Components**:
- Waterfall diagram (spans timeline)
- Span details panel
- GPU metrics overlay (for GPU jobs)
- MCP-powered Q&A

**Trace Loading**:

```python
from datasets import load_dataset

def load_trace(trace_id, traces_repo):
    # Load trace dataset and pick the matching trace
    ds = load_dataset(traces_repo, split="train")
    trace_data = ds.filter(lambda x: x["trace_id"] == trace_id)[0]

    # Render waterfall
    waterfall_html = render_waterfall(trace_data["spans"])
    return waterfall_html
```

**MCP Q&A**:

```python
def ask_trace_question(trace_id, traces_repo, question, mcp_client):
    # Ask the MCP server to debug the trace
    answer = mcp_client.debug_trace(
        trace_id=trace_id,
        traces_repo=traces_repo,
        question=question
    )
    return answer
```
#### screens/settings.py

**Purpose**: Configure API keys and preferences.

**Security**:
- Keys held in per-session Gradio State (in memory only, never persisted server-side)
- All forms use `api_name=False` (not exposed via the public API)
- HTTPS encryption for all API calls

**Configuration Options** (wired as shown in the sketch below):
- Gemini API Key
- HuggingFace Token
- Modal Token ID + Secret
- LLM Provider Keys (OpenAI, Anthropic, etc.)
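
A sketch of how these fields can be wired, with masked inputs and `api_name=False`; it assumes the `save_api_keys` handler shown later in this document:

```python
import gradio as gr

with gr.Blocks() as demo:
    session_state = gr.State({})
    gemini_key = gr.Textbox(label="Gemini API Key", type="password")
    hf_token = gr.Textbox(label="HuggingFace Token", type="password")
    save_btn = gr.Button("Save")
    status = gr.Markdown()

    # api_name=False keeps this handler off the public Gradio API
    save_btn.click(
        fn=save_api_keys,
        inputs=[gemini_key, hf_token, session_state],
        outputs=[session_state, status],
        api_name=False,
    )
```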
---

### 3. Component Layer (components/)

Reusable UI components shared across multiple screens.

#### components/leaderboard_table.py

**Purpose**: Custom HTML table with sorting, filtering, and styling.

**Why a Custom Component?**:
- Gradio's default Dataframe component lacks advanced styling
- Clickable rows are needed for navigation
- Custom sorting and filtering logic
- Badge rendering for metrics

**Implementation**:

```python
def render_leaderboard_table(df: pd.DataFrame) -> str:
    """Render leaderboard as interactive HTML table"""
    html = """
    <style>
        .leaderboard-table { ... }
        .metric-badge { ... }
    </style>
    <table class="leaderboard-table">
        <thead>
            <tr>
                <th onclick="sortTable(0)">Model</th>
                <th onclick="sortTable(1)">Success Rate</th>
                <th onclick="sortTable(2)">Cost</th>
                ...
            </tr>
        </thead>
        <tbody>
    """

    for idx, row in df.iterrows():
        html += f"""
        <tr onclick="selectRun('{row['run_id']}')">
            <td>{row['model']}</td>
            <td><span class="badge success">{row['success_rate']}%</span></td>
            <td>${row['total_cost_usd']:.4f}</td>
            ...
        </tr>
        """

    html += """
        </tbody>
    </table>
    <script>
        function sortTable(col) { ... }
        function selectRun(runId) {
            // Trigger Gradio event to navigate to run detail
            document.dispatchEvent(new CustomEvent('runSelected', {detail: runId}));
        }
    </script>
    """
    return html
```

**Integration with Gradio**:

```python
# In the leaderboard screen
table_html = gr.HTML()

load_btn.click(
    fn=lambda: render_leaderboard_table(df),
    outputs=table_html
)
```
#### components/analytics_charts.py

**Purpose**: Performance charts built with Plotly.

**Charts Provided**:
- Success rate over time (line chart)
- Cost comparison (bar chart)
- Duration distribution (histogram)
- CO2 emissions by model (pie chart)

**Example**:

```python
import plotly.graph_objects as go

def create_cost_comparison_chart(df):
    fig = go.Figure(data=[
        go.Bar(
            x=df['model'],
            y=df['total_cost_usd'],
            marker_color='indianred'
        )
    ])
    fig.update_layout(
        title="Cost Comparison by Model",
        xaxis_title="Model",
        yaxis_title="Total Cost (USD)"
    )
    return fig
```
#### components/thought_graph.py

**Purpose**: Visualize agent reasoning steps (for Agent Chat).

**Visualization** (a sketch follows):
- Graph nodes: reasoning steps, tool calls
- Edges: flow between steps
- Annotations: tool results, errors
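
As a sketch of the idea (the `steps` structure is hypothetical), reasoning steps can be laid out as a simple node-edge Plotly figure:

```python
import plotly.graph_objects as go

def render_thought_graph(steps):
    """Sketch: lay steps out left-to-right and connect them in order.

    `steps` is a hypothetical list of dicts like {"label": "Call get_top_performers"}.
    """
    xs = list(range(len(steps)))
    ys = [0] * len(steps)
    labels = [step["label"] for step in steps]

    fig = go.Figure()
    # Edges: a single polyline through consecutive steps
    fig.add_trace(go.Scatter(x=xs, y=ys, mode="lines", line=dict(color="gray")))
    # Nodes: one marker per step, annotated with its label
    fig.add_trace(go.Scatter(
        x=xs, y=ys, mode="markers+text",
        text=labels, textposition="top center",
        marker=dict(size=24, color="indigo"),
    ))
    fig.update_layout(showlegend=False, xaxis_visible=False, yaxis_visible=False)
    return fig
```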
---

### 4. MCP Client Layer (mcp_client/)

#### mcp_client/client.py - Async MCP Client

**Purpose**: Connect to the TraceMind MCP Server via the MCP protocol.

**Implementation**: (See [MCP_INTEGRATION.md](MCP_INTEGRATION.md) for full code)

**Key Methods**:
- `connect()`: Establish SSE connection to MCP server
- `call_tool(tool_name, arguments)`: Call an MCP tool
- `analyze_leaderboard(**kwargs)`: Wrapper for the analyze_leaderboard tool
- `estimate_cost(**kwargs)`: Wrapper for the estimate_cost tool
- `debug_trace(**kwargs)`: Wrapper for the debug_trace tool
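
As a rough sketch of what `call_tool()` does under the hood with the MCP Python SDK over SSE (the actual implementation lives in MCP_INTEGRATION.md):

```python
from mcp import ClientSession
from mcp.client.sse import sse_client

async def call_tool(server_url: str, tool_name: str, arguments: dict) -> str:
    """Open an SSE session, call a single tool, and return its text content"""
    async with sse_client(server_url) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()
            result = await session.call_tool(tool_name, arguments=arguments)
            return result.content[0].text
```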
#### mcp_client/sync_wrapper.py - Synchronous Wrapper

**Purpose**: Provide a synchronous API for Gradio event handlers.

**Why Needed?**: Gradio event handlers are synchronous, but the MCP client is async.

**Pattern**:

```python
import asyncio

class SyncMCPClient:
    def __init__(self, mcp_server_url):
        self.async_client = AsyncMCPClient(mcp_server_url)

    def _run_async(self, coro):
        """Run an async coroutine from synchronous code"""
        try:
            loop = asyncio.get_event_loop()
        except RuntimeError:
            # No event loop in this thread yet: create one
            loop = asyncio.new_event_loop()
            asyncio.set_event_loop(loop)
        return loop.run_until_complete(coro)

    def analyze_leaderboard(self, **kwargs):
        """Synchronous wrapper"""
        return self._run_async(self.async_client.analyze_leaderboard(**kwargs))
```
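
Note that `run_until_complete` raises if the calling thread already has a running loop. An alternative sketch keeps one dedicated loop in a daemon thread and hands coroutines to it:

```python
import asyncio
import threading

class BackgroundLoop:
    """Run a private event loop in a daemon thread; submit coroutines from sync code"""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        threading.Thread(target=self.loop.run_forever, daemon=True).start()

    def run(self, coro, timeout=60):
        # Thread-safe hand-off; blocks the caller until the coroutine finishes
        future = asyncio.run_coroutine_threadsafe(coro, self.loop)
        return future.result(timeout=timeout)
```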
---

### 5. Data Loader (data_loader.py)

**Purpose**: Load and cache HuggingFace datasets.

**Features**:
- In-memory caching (5-minute TTL)
- Error handling for missing datasets
- Automatic retry logic
- Dataset validation

**Implementation**:

```python
import time

import pandas as pd
from datasets import load_dataset

class DataLoader:
    def __init__(self):
        self.cache = {}
        self.cache_ttl = 300  # 5 minutes

    def load_leaderboard(self, repo="kshitijthakkar/smoltrace-leaderboard"):
        """Load leaderboard with caching"""
        cache_key = f"leaderboard:{repo}"

        # Check cache
        if cache_key in self.cache:
            cached_time, cached_data = self.cache[cache_key]
            if time.time() - cached_time < self.cache_ttl:
                return cached_data

        # Load fresh data
        ds = load_dataset(repo, split="train")
        df = pd.DataFrame(ds)

        # Cache
        self.cache[cache_key] = (time.time(), df)
        return df

    def load_results(self, repo):
        """Load results dataset for a specific run"""
        ds = load_dataset(repo, split="train")
        return pd.DataFrame(ds)

    def load_traces(self, repo):
        """Load traces dataset for a specific run"""
        ds = load_dataset(repo, split="train")
        return ds  # Keep as a Dataset for filtering
```
---

## MCP Client Architecture

**Full details in**: [MCP_INTEGRATION.md](MCP_INTEGRATION.md)

**Summary**:
- **Async Client**: `mcp_client/client.py` - async MCP protocol implementation
- **Sync Wrapper**: `mcp_client/sync_wrapper.py` - synchronous API for Gradio
- **Global Instance**: Initialized once in `app.py`, shared across all screens

**Usage Pattern**:

```python
# In app.py (initialization)
from mcp_client.sync_wrapper import get_sync_mcp_client

mcp_client = get_sync_mcp_client()
mcp_client.initialize()

# In a screen (usage)
def some_event_handler(mcp_client):
    result = mcp_client.analyze_leaderboard(metric_focus="cost")
    return result
```
---

## Agent Framework Integration

**Full details in**: [MCP_INTEGRATION.md](MCP_INTEGRATION.md)

**Framework**: smolagents (HuggingFace's agent framework)

**Key Features**:
- Autonomous tool discovery from the MCP server
- Multi-step reasoning with tool chaining
- Context-aware responses
- Reasoning visualization (optional)

**Agent Setup**:

```python
from smolagents import ToolCallingAgent, MCPClient, HfApiModel

mcp_client = MCPClient({"url": MCP_SERVER_URL})
agent = ToolCallingAgent(
    tools=mcp_client.get_tools(),  # tools discovered from the MCP server
    model=HfApiModel("Qwen/Qwen2.5-Coder-32B-Instruct"),
    max_steps=10
)
```
---

## Data Flow

### Leaderboard Loading Flow

```
1. User clicks "Load Leaderboard"
        ↓
2. Gradio Event Handler (leaderboard.py)
   load_leaderboard()
        ↓
3. Data Loader (data_loader.py)
   ├─→ Check cache (5-min TTL)
   │   └─→ If cached: return cached data
   └─→ If not cached: load from HF Datasets
       └─→ load_dataset("kshitijthakkar/smoltrace-leaderboard")
        ↓
4. MCP Client (sync_wrapper.py)
   mcp_client.analyze_leaderboard(metric_focus="overall")
        ↓
5. MCP Server (TraceMind-mcp-server)
   ├─→ Load data
   ├─→ Call Gemini API
   └─→ Return AI analysis
        ↓
6. Render Components
   ├─→ AI Insights (Markdown)
   └─→ Leaderboard Table (Custom HTML)
        ↓
7. Display to User
```

### Agent Chat Flow

```
1. User types message: "What are the top 3 models?"
        ↓
2. Gradio Event Handler (chat.py)
   agent_chat(message, history, show_reasoning)
        ↓
3. smolagents Agent
   agent.run(message)
   ├─→ Step 1: Plan approach
   │   └─→ "Need to get top models from leaderboard"
   ├─→ Step 2: Discover MCP tools
   │   └─→ Found: get_top_performers, analyze_leaderboard
   ├─→ Step 3: Call MCP tool
   │   └─→ get_top_performers(metric="success_rate", top_n=3)
   ├─→ Step 4: Parse result
   │   └─→ Extract model names, success rates, costs
   └─→ Step 5: Format response
       └─→ Generate markdown table with insights
        ↓
4. Return to user with full reasoning trace (if enabled)
```

### Job Submission Flow

```
1. User fills form → clicks "Submit Evaluation"
        ↓
2. Gradio Event Handler (dashboard.py)
   submit_job(model, agent_type, hardware, infrastructure)
        ↓
3. Job Submission Module (utils/)
   if infrastructure == "HuggingFace Jobs":
   ├─→ hf_jobs_submission.py
   ├─→ Build job config
   ├─→ Submit via HF Jobs API
   └─→ Return job_id
   elif infrastructure == "Modal":
   ├─→ modal_job_submission.py
   ├─→ Build Modal app config
   ├─→ Submit via Modal SDK
   └─→ Return job_id
        ↓
4. Store job_id in session state
        ↓
5. Redirect to Job Monitoring screen
        ↓
6. Auto-refresh status every 30s
```
---

## Authentication & Authorization

### HuggingFace OAuth

**Implementation**: `utils/auth.py`

**Flow**:

```
1. User visits TraceMind-AI
        ↓
2. Check OAuth token in session
   ├─→ If valid: proceed to app
   └─→ If invalid: show login screen
        ↓
3. User clicks "Sign in with HuggingFace"
        ↓
4. Redirect to HuggingFace OAuth page
   ├─→ User authorizes TraceMind-AI
   └─→ HF redirects back with token
        ↓
5. Store token in Gradio State (session)
        ↓
6. Use token for:
   ├─→ HF Datasets access
   ├─→ HF Jobs submission
   └─→ User identification
```
**Code**:

```python
# utils/auth.py
import gradio as gr

def auth_ui():
    """Create the OAuth login UI (HF Spaces wires up the HuggingFace provider)"""
    gr.LoginButton(value="Sign in with HuggingFace")

# In app.py
with gr.Blocks() as app:
    if not DISABLE_OAUTH:
        auth_ui()
```
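
Gradio can also inject the signed-in user's identity directly into event handlers via typed parameters, which is how user-scoped HF API calls can be made (a sketch):

```python
import gradio as gr

def whoami(profile: gr.OAuthProfile | None, token: gr.OAuthToken | None) -> str:
    """Gradio fills these parameters from the OAuth session when the user is logged in"""
    if profile is None:
        return "Please sign in with HuggingFace."
    # token.token can be forwarded to HF Datasets / HF Jobs calls
    return f"Signed in as {profile.username}"
```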
### API Key Storage

**Strategy**: Session-only storage (no server-side persistence)

**Implementation**:

```python
import os

# In the settings screen: session_state is the dict held in a gr.State
def save_api_keys(gemini_key, hf_token, session_state):
    """Store keys in the per-session state"""
    session_state["gemini_key"] = gemini_key
    session_state["hf_token"] = hf_token

    # Override default clients with user-supplied keys
    if gemini_key:
        os.environ["GEMINI_API_KEY"] = gemini_key
    if hf_token:
        os.environ["HF_TOKEN"] = hf_token

    return session_state, "✅ API keys saved for this session"
```

**Security**:
- ✅ Keys held in session memory only
- ✅ Not saved to disk or a database
- ✅ Forms use `api_name=False` (not exposed via the public API)
- ✅ HTTPS encryption
---

## Screen Navigation

### State Management

**Pattern**: Gradio State components for session data

```python
# In app.py
with gr.Blocks() as app:
    # Global state
    session_state = gr.State({
        "user": None,
        "current_run_id": None,
        "current_trace_id": None,
        "api_keys": {}
    })

    # Pass to all screens
    leaderboard_screen(session_state)
    chat_screen(session_state)
```

### Navigation Between Screens

**Pattern**: Click event triggers a tab switch plus a state update

```python
# In the leaderboard screen
def row_click(run_id, session_state):
    """Navigate to run detail when a row is clicked"""
    session_state["current_run_id"] = run_id
    # Switch to the Trace Detail tab (tab index 4); in Gradio 4+,
    # returning gr.Tabs(selected=...) replaces the removed .update() API
    return gr.Tabs(selected=4), session_state

table_component.select(
    fn=row_click,
    inputs=[gr.State(), session_state],
    outputs=[main_tabs, session_state]
)
```
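
Numeric tab indices break if tabs are reordered; Gradio also accepts stable string ids, a sketch:

```python
import gradio as gr

with gr.Blocks() as app:
    with gr.Tabs() as main_tabs:
        with gr.Tab("📊 Leaderboard", id="leaderboard"):
            ...
        with gr.Tab("🔍 Trace Detail", id="trace_detail"):
            ...

    # Selecting by id survives tab reordering
    def go_to_trace_detail():
        return gr.Tabs(selected="trace_detail")
```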
---

## Job Submission Architecture

### HuggingFace Jobs Integration

**File**: `utils/hf_jobs_submission.py`

**Key Functions**:

```python
import requests

def submit_hf_job(model, agent_type, hardware, api_keys):
    """Submit evaluation job to HuggingFace Jobs"""
    # 1. Build job config
    job_config = {
        "name": f"SMOLTRACE Eval - {model}",
        "hardware": hardware,  # cpu-basic, t4-small, a10g-small, a100-large, h200
        "environment": {
            "MODEL": model,
            "AGENT_TYPE": agent_type,
            "HF_TOKEN": api_keys["hf_token"],
            # ... other env vars
        },
        "command": [
            "pip install smoltrace[otel,gpu]",
            f"smoltrace-eval --model {model} --agent-type {agent_type} ..."
        ]
    }

    # 2. Submit via HF Jobs API
    response = requests.post(
        "https://huggingface.co/api/jobs",
        headers={"Authorization": f"Bearer {api_keys['hf_token']}"},
        json=job_config
    )

    # 3. Return job ID
    job_id = response.json()["id"]
    return job_id
```
### Modal Integration

**File**: `utils/modal_job_submission.py`

**Key Functions**:

```python
import modal

def submit_modal_job(model, agent_type, hardware, api_keys):
    """Submit evaluation job to Modal"""
    # 1. Create Modal app
    app = modal.App("smoltrace-eval")

    # 2. Define function with GPU
    @app.function(
        image=modal.Image.debian_slim().pip_install("smoltrace[otel,gpu]"),
        gpu=hardware,  # A10, A100-80GB, H200
        secrets=[
            modal.Secret.from_dict({
                "HF_TOKEN": api_keys["hf_token"],
                # ... other secrets
            })
        ]
    )
    def run_evaluation(model: str, agent_type: str):
        import smoltrace
        # Run evaluation
        return smoltrace.evaluate(model=model, agent_type=agent_type)

    # 3. Deploy and spawn without blocking; the function call ID
    #    serves as the job ID for later status lookups
    with app.run(detach=True):
        call = run_evaluation.spawn(model, agent_type)
        return call.object_id
```
---

## Deployment

### HuggingFace Spaces

**Platform**: HuggingFace Spaces
**SDK**: Gradio 5.49.1
**Hardware**: CPU Basic (upgradeable)
**URL**: https://huggingface.co/spaces/MCP-1st-Birthday/TraceMind

### Configuration

**Space Metadata** (README.md header):

```yaml
---
title: TraceMind AI
emoji: 🧠
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
short_description: AI agent evaluation with MCP-powered intelligence
license: agpl-3.0
pinned: true
tags:
  - mcp-in-action-track-enterprise
  - agent-evaluation
  - mcp-client
  - leaderboard
  - gradio
---
```
### Environment Variables

**Set in HF Spaces Secrets**:

```bash
# Required
GEMINI_API_KEY=your_gemini_key
HF_TOKEN=your_hf_token

# Optional
MCP_SERVER_URL=https://mcp-1st-birthday-tracemind-mcp-server.hf.space/gradio_api/mcp/sse
LEADERBOARD_REPO=kshitijthakkar/smoltrace-leaderboard
DISABLE_OAUTH=false  # Set to true for local development
```
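
A sketch of how the app can read these at startup with safe defaults (variable names match the list above):

```python
import os

GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")  # required
HF_TOKEN = os.getenv("HF_TOKEN")              # required

MCP_SERVER_URL = os.getenv(
    "MCP_SERVER_URL",
    "https://mcp-1st-birthday-tracemind-mcp-server.hf.space/gradio_api/mcp/sse",
)
LEADERBOARD_REPO = os.getenv("LEADERBOARD_REPO", "kshitijthakkar/smoltrace-leaderboard")
DISABLE_OAUTH = os.getenv("DISABLE_OAUTH", "false").lower() == "true"

if not (GEMINI_API_KEY and HF_TOKEN):
    raise RuntimeError("GEMINI_API_KEY and HF_TOKEN must be set")
```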
---

## Performance Optimization

### 1. Data Caching

**Implementation**: `data_loader.py`
- In-memory cache with 5-minute TTL
- Reduces HF Datasets API calls
- Faster page loads

### 2. Async MCP Calls

**Pattern**: Use async for non-blocking I/O

```python
import asyncio

# Could be optimized to run in parallel
async def load_data_with_insights():
    leaderboard_task = load_dataset_async(...)
    insights_task = mcp_client.analyze_leaderboard_async(...)
    leaderboard, insights = await asyncio.gather(leaderboard_task, insights_task)
    return leaderboard, insights
```
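
Since Gradio runs `async def` event handlers natively, a parallel loader like this can be wired to a button directly, without going through the sync wrapper (component names here are illustrative):

```python
# load_btn, table_html, and insights_md are illustrative Gradio components
load_btn.click(
    fn=load_data_with_insights,  # the async sketch above
    outputs=[table_html, insights_md],
)
```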
### 3. Component Lazy Loading

**Strategy**: Load components only when tabs are activated

```python
with gr.Tab("Trace Detail", visible=False) as trace_tab:
    # Components are created only when the tab is first shown
    @trace_tab.select
    def load_trace_components():
        return build_trace_visualization()
```
---

## Related Documentation

- [README.md](README.md) - Overview and quick start
- [USER_GUIDE.md](USER_GUIDE.md) - Complete screen-by-screen guide
- [MCP_INTEGRATION.md](MCP_INTEGRATION.md) - MCP client implementation
- [TraceMind MCP Server](https://github.com/Mandark-droid/TraceMind-mcp-server/blob/main/ARCHITECTURE.md) - Server-side architecture

---

**Last Updated**: November 21, 2025
**Version**: 1.0.0
**Track**: MCP in Action (Enterprise)