	<!DOCTYPE html>
	<html lang="en">
	<head>
	<meta charset="UTF-8">
	<meta name="viewport" content="width=device-width, initial-scale=1.0">
	<title>Claude Code Skills</title>
	<style>
	* { margin: 0; padding: 0; box-sizing: border-box; }
	body {
	font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, sans-serif;
	background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
	min-height: 100vh;
	padding: 20px;
	}
	.container {
	max-width: 800px;
	margin: 0 auto;
	}
	header {
	text-align: center;
	color: white;
	margin-bottom: 40px;
	}
	header h1 {
	font-size: 3rem;
	margin-bottom: 10px;
	}
	header p {
	font-size: 1.2rem;
	opacity: 0.9;
	}
	.post {
	background: white;
	border-radius: 12px;
	padding: 30px;
	margin-bottom: 20px;
	box-shadow: 0 10px 30px rgba(0,0,0,0.2);
	}
	.post h2 {
	color: #333;
	margin-bottom: 5px;
	font-size: 1.5rem;
	}
	.post .tag {
	display: inline-block;
	background: #667eea;
	color: white;
	padding: 3px 10px;
	border-radius: 20px;
	font-size: 0.75rem;
	margin-bottom: 10px;
	}
	.post .date {
	color: #888;
	font-size: 0.85rem;
	margin-bottom: 15px;
	}
	.post .content {
	color: #444;
	line-height: 1.7;
	white-space: pre-line;
	}
	.link {
	color: #667eea;
	text-decoration: none;
	font-weight: 500;
	}
	.link:hover {
	text-decoration: underline;
	}
	</style>
	</head>
	<body>
	<div class="container">
	<header>
	<h1>🤖 Claude Code Skills</h1>
	<p>Documentation and overview of useful skills &amp; tools</p>
	</header>

	<div class="post">
	<span class="tag">HuggingFace MCP</span>
	<h2>🐋 Docker Dynamic MCP + HuggingFace</h2>
	<div class="date">30 Dec 2025 • Nine components, one ecosystem</div>
	<div class="content"><strong>Nine things come together here:</strong>

	<strong>1. Docker Dynamic MCP Gateway</strong> (Infrastructure)
	The gateway makes all tools available on demand, without context overhead.
	<a href="https://docs.docker.com/ai/mcp-catalog-and-toolkit/dynamic-mcp/" class="link">Docker Dynamic MCP Docs</a>
	• 272+ MCP servers in the catalog
	• On-demand activation without token overhead
	• Central registry for all MCP tools

	<strong>2. Docker MCP Server</strong> (Default in the Gateway)
	Docker commands exposed directly as MCP tools, for full container control.
	<a href="https://docs.docker.com/ai/mcp-catalog-and-toolkit/server-docker/" class="link">Docker MCP Server Docs</a>
	• Start, stop, and manage containers
	• Build, pull, and push images
	• Docker Compose, volumes, networks
	• GPU containers for ML workloads
	• <span style="color: #667eea;">Enabled by default in the gateway!</span>

	<strong>3. Claude Code Skills</strong> (HF Model Training)
	HuggingFace Skills for model-training and publishing workflows.
	<a href="https://github.com/huggingface/skills" class="link">GitHub: huggingface/skills</a>
	<a href="https://huggingface.co/blog/hf-skills-training" class="link">Blog Post: HF Skills Training</a>
	• SFT, DPO, and GRPO training on HF Jobs
	• Publish papers, create datasets, run evaluations

	<strong>4. OpenAI Codex Skills</strong> (Code Generation)
	OpenAI Codex directly in Claude Code for code workflows.
	<a href="https://github.com/huggingface/skills" class="link">GitHub: huggingface/skills</a>
	<a href="https://huggingface.co/blog/hf-skills-training-codex" class="link">Blog Post: HF Codex Integration</a>
	• Code generation and completion
	• Bug finding and fixing, refactoring
	• Writing documentation, creating unit tests

	<strong>5. Beads</strong> (AI Task-Tracking)
	Git-backed graph issue tracker for AI agents.
	<a href="https://github.com/steveyegge/beads" class="link">GitHub: steveyegge/beads</a>
	• Persistent graph storage for LLM tasks
	• Dependency tracking between tasks
	• Hash-based IDs avoid conflicts
	• Git-backed: all tasks are versioned
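
The two ideas in the list above, hash-based IDs and dependency tracking, can be sketched in a few lines. This is a minimal illustration, not how Beads stores its data: the Task class, the "task-" prefix, and the ready_tasks helper are hypothetical names invented for this example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Task:
    title: str
    deps: list[str] = field(default_factory=list)  # IDs of tasks this one waits on
    done: bool = False

    @property
    def task_id(self) -> str:
        # Content-derived ID: different tasks created on separate branches get
        # different IDs without coordinating a shared counter.
        digest = hashlib.sha256(self.title.encode("utf-8")).hexdigest()
        return "task-" + digest[:8]


def ready_tasks(tasks: list[Task]) -> list[Task]:
    """Return open tasks whose dependencies are all done."""
    done_ids = {t.task_id for t in tasks if t.done}
    return [t for t in tasks if not t.done and all(d in done_ids for d in t.deps)]


schema = Task("Design schema", done=True)
api = Task("Build API", deps=[schema.task_id])
ui = Task("Build UI", deps=[api.task_id])

print([t.title for t in ready_tasks([schema, api, ui])])  # prints ['Build API']
```

Only "Build API" is ready: its single dependency is done, while "Build UI" is still blocked.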

	<strong>6. Gemma Scope 2 + Neuronpedia</strong> (Interpretability)
	Mechanistic interpretability for transparent agent training.
	<a href="https://www.neuronpedia.org/gemma-scope-2" class="link">neuronpedia.org/gemma-scope-2</a>
	• Discovery: find important neurons and circuits
	• Steering: actively influence agent behavior
	• Freezing: lock in learned patterns

	<strong>7. Custom MCP Skills</strong> (Build Your Own)
	You can create your own MCP skills and integrate them into the Docker Dynamic MCP Gateway.
	<a href="https://docs.docker.com/ai/mcp-catalog-and-toolkit/authoring/" class="link">Docker MCP Authoring Docs</a>
	• Write and publish your own tools
	• Share skills in the MCP catalog
	• Make them available to others on demand

	<strong>8. On-Demand Agents</strong> (LLM Collections)
	Create specialized agents on demand from model mixes.
	• Code agent: Claude Sonnet + GPT-4o + DeepSeekCoder
	• Research agent: Claude Opus + Qwen2.5 + Llama 3.1
	• Writing agent: GPT-4o + Gemini + Mistral
	• Local-first: Llama, Qwen, DeepSeek running locally
	• <span style="color: #667eea;">Specialize at a high level through model mixing!</span>

	<strong>9. DSPy + GEPA</strong> (Reliability Layer)
	Prompt optimization through reflection instead of reinforcement learning.
	• <strong>DSPy:</strong> Treat LLMs like CPUs/GPUs: declare signatures, not prompts
	• <strong>GEPA:</strong> Genetic-Pareto prompt optimizer with reflection
	• <strong>Agentic RAG:</strong> Confidence-based + multi-hop, with the option to answer "I don't know"
	• <span style="color: #667eea;">Up to 35x more efficient</span> than conventional optimization
	• <span style="color: #667eea;">Up to 9x shorter prompts</span> with about 10% better performance
	• <span style="color: #667eea;">Enterprise-ready:</span> Consistent outputs, fewer hallucinations
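
The "Pareto" part of the optimizer's name can be illustrated with a small selection step: keep every prompt candidate that no other candidate beats across all tasks. The sketch below shows the general idea of a Pareto front only, not GEPA's actual implementation; the function names and the scores are made up for illustration.

```python
def dominates(a: list[float], b: list[float]) -> bool:
    """True if a scores at least as well as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates: dict[str, list[float]]) -> list[str]:
    """Keep every candidate that no other candidate dominates."""
    return [
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other, scores)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]


# Hypothetical per-task scores for three prompt variants (made-up numbers).
scores = {
    "prompt_a": [0.9, 0.4, 0.5],
    "prompt_b": [0.6, 0.8, 0.5],
    "prompt_c": [0.5, 0.4, 0.5],
}
print(pareto_front(scores))  # prints ['prompt_a', 'prompt_b']
```

prompt_c is dropped because prompt_a is at least as good on every task. prompt_a and prompt_b both survive because each wins on a different task, which keeps diverse strategies available for the next round.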

	<strong>The combination:</strong> The gateway provides the infrastructure and catalog (272+ servers), Docker MCP is the default, HuggingFace/Codex supply the AI capabilities, Beads tracks tasks, Gemma Scope makes training transparent, custom skills extend the ecosystem, on-demand agents specialize in tasks, and DSPy+GEPA adds a reliability layer.

	Also in the gateway: GitHub, Sentry, Z-Image, web search, browser automation

	<strong>All tools on demand</strong>: available when needed, without token overhead!

	<span style="color: #667eea;">━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━</span>

	<strong>🔓 Freedom &amp; Ownership</strong>

	<strong>No Vendor Lock-in:</strong>
	• Self-hosted with the Docker MCP Gateway
	• <span style="color: #667eea;">On-demand agents with any model mix</span>
	• Switch between models (Anthropic, HF, OpenAI, local)
	• No cloud dependency thanks to local containers
	• Open-source stack, fully replaceable

	<strong>Data Ownership:</strong>
	• Beads: Git-backed, so your tasks belong to you
	• Datasets: HF Hub with your own repos
	• Models: local finetuning results
	• Steering vectors: exportable and reusable
	• <span style="color: #667eea;">DSPy prompt evolution: the tree of optimized prompts belongs to you</span>

	<span style="color: #667eea;">━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━</span></div>
	</div>

	<div class="post">
	<span class="tag">Agent Training Loop</span>
	<h2>🔄 Self-Improving Agent Loop</h2>
	<div class="date">30 Dec 2025 • Closed-Loop AI Agent Training</div>
	<div class="content"><strong>The vision:</strong> An agent that improves itself through iterative loops.

	<strong>The components:</strong>

	<strong>1. Ralph Wiggum</strong> (Loop Engine)
	Iterative AI agent loops with self-referential feedback.
	<a href="https://github.com/anthropics/claude-code/tree/main/plugins/ralph-wiggum" class="link">Ralph Wiggum GitHub</a>
	• /ralph-loop starts the loop
	• The stop hook collects the result of each iteration
	• /cancel-ralph cancels the loop

	<strong>2. Beads</strong> (Task Memory)
	Git-backed graph issue tracker for tasks.
	<a href="https://github.com/steveyegge/beads" class="link">Beads GitHub</a>
	• Tasks stored as graph nodes
	• Dependencies and blockers are visible
	• Git-backed: every loop is versioned

	<strong>3. Docker MCP Server</strong> (Container Runtime)
	Everything runs in containers: reproducible and isolated.
	<a href="https://docs.docker.com/ai/mcp-catalog-and-toolkit/server-docker/" class="link">Docker MCP Server Docs</a>
	• Create containers <span style="color: #667eea;">on demand</span>
	• After use, automatic <span style="color: #667eea;">kill &amp; cleanup</span>
	• <span style="color: #667eea;">All with Docker sandboxes!</span>
	• GPU containers for ML workloads

	<strong>Two disciplines for agent improvement:</strong>

	<span style="color: #667eea;">━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━</span>

	<strong>🔧 PATH A: Finetuning</strong> (Permanent)
	<a href="https://github.com/huggingface/skills" class="link">HF Skills GitHub</a>
	• <strong>What:</strong> Model weights are changed permanently
	• <strong>How:</strong> SFT, DPO, GRPO on HF Jobs
	• <strong>Result:</strong> A new model with the learned behavior
	• <strong>Duration:</strong> Permanent
	• <strong>Advantage:</strong> Learned knowledge is retained

	<strong>🎯 PATH B: Steering</strong> (Runtime)
	<a href="https://www.neuronpedia.org/gemma-scope-2" class="link">Gemma Scope 2 + Neuronpedia</a>
	• <strong>What:</strong> Influence behavior at runtime
	• <strong>How:</strong> Activation engineering / feature steering
	• <strong>Result:</strong> Changes the output without changing the weights
	• <strong>Duration:</strong> Only during inference
	• <strong>Advantage:</strong> Reversible, no retraining needed

	<span style="color: #667eea;">━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━</span>

	<strong>Combining both paths:</strong>

	<strong>Discovery Skills</strong> (Gemma Scope 2 + Neuronpedia)
	• Find the SAE features that determine behavior
	• Identify circuits (causal chains)
	• 4TB+ of activations, explanations, and metadata

	<strong>Steering Skills</strong> (Runtime Control)
	• Increase or decrease feature strength (↑/↓)
	• API: POST /api/steer with strength_multiplier
	• Immediate effect without training
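
Conceptually, feature steering adds a scaled feature direction to a model's activations at inference time. The sketch below shows that arithmetic on plain lists. It is an illustration under assumptions, not the Neuronpedia API: the steer function and the vectors are hypothetical, and only the name strength_multiplier is borrowed from the list above.

```python
def steer(activations: list[float], direction: list[float], strength_multiplier: float) -> list[float]:
    """Add a scaled feature direction to an activation vector."""
    if len(activations) != len(direction):
        raise ValueError("activation and direction must have the same length")
    return [a + strength_multiplier * d for a, d in zip(activations, direction)]


hidden = [1.0, 0.0, 2.0]   # made-up activation vector
feature = [0.0, 1.0, 0.0]  # made-up feature direction

boosted = steer(hidden, feature, 2.0)      # amplify the feature
suppressed = steer(hidden, feature, -2.0)  # suppress it
print(boosted, suppressed)  # prints [1.0, 2.0, 2.0] [1.0, -2.0, 2.0]
```

The weights are never touched, which is why this path is reversible: drop the added term and the model behaves as before.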

	<strong>Freezing Skills</strong> (Persistence)
	• Identify and store important circuits
	• Carry successful patterns over into finetuning
	• Keep agent behavior consistent

	<strong>The loop with both disciplines:</strong>
	1. Ralph starts → the agent executes the task
	2. Beads tracks → the graph stores progress
	3. Docker MCP → containers are created on demand
	4. Agent works → isolated in the sandbox container
	5. <span style="color: #667eea;">[PATH A]</span> HF Skills → finetuning for permanent learning
	6. <span style="color: #667eea;">[PATH B]</span> Gemma Scope → analyze activations
	7. <span style="color: #667eea;">[PATH B]</span> Neuronpedia → discovery: find features
	8. <span style="color: #667eea;">[PATH B]</span> Steering → runtime correction
	9. <span style="color: #667eea;">[BOTH]</span> Freezing → lock in successful patterns
	10. Container cleanup → automatic kill and removal
	11. Loop repeats → improved agent
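
The control flow of the steps above reduces to a bounded feedback loop: run the task, check the result, feed it back in, and stop on success or when the iteration budget runs out. This is a minimal sketch with a stand-in agent, not the Ralph Wiggum plugin's implementation; run_loop, fake_agent, and the completion check are hypothetical.

```python
from typing import Callable


def run_loop(
    run_task: Callable[[str], str],
    is_done: Callable[[str], bool],
    prompt: str,
    max_iterations: int = 5,
) -> tuple[str, int]:
    """Re-run the same task until the completion check passes or the budget runs out."""
    result = ""
    for iteration in range(1, max_iterations + 1):
        # Each pass sees the previous result, which is the feedback part of the loop.
        result = run_task(prompt + "\nPrevious result: " + result)
        if is_done(result):
            return result, iteration
    return result, max_iterations


# Stand-in agent: "improves" by appending one step per call.
def fake_agent(text: str) -> str:
    previous = text.split("Previous result: ")[-1]
    return previous + "+step"


final, used = run_loop(fake_agent, lambda r: r.count("+step") >= 3, "Refactor module")
print(final, used)  # prints +step+step+step 3
```

The iteration cap matters in practice: without it, a completion check that never passes would keep the loop, and its containers, running indefinitely.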

	<strong>Use Cases:</strong>
	• Train a code-refactoring agent
	• Improve bug-finding skills
	• Optimize domain-specific tasks

	<strong>Links:</strong>
	<a href="https://github.com/anthropics/claude-code/tree/main/plugins/ralph-wiggum" class="link">Ralph Wiggum GitHub</a>
	<a href="https://github.com/steveyegge/beads" class="link">Beads GitHub</a>
	<a href="https://docs.docker.com/ai/mcp-catalog-and-toolkit/server-docker/" class="link">Docker MCP Server</a>
	<a href="https://github.com/huggingface/skills" class="link">HF Skills GitHub</a>
	<a href="https://huggingface.co/blog/hf-skills-training" class="link">HF Skills Blog</a>
	<a href="https://www.neuronpedia.org/api-doc" class="link">Neuronpedia API</a>
	<a href="https://deepmind.google/blog/gemma-scope-2" class="link">Gemma Scope 2 DeepMind</a></div>
	</div>

	<div class="post">
	<span class="tag">Claude Code Plugin</span>
	<h2>🐑 Ralph Wiggum</h2>
	<div class="date">Anthropic Official Plugin</div>
	<div class="content">A Claude Code plugin for iterative AI agent loops.

	<a href="https://github.com/anthropics/claude-code/tree/main/plugins/ralph-wiggum" class="link">GitHub: anthropics/claude-code/plugins/ralph-wiggum</a>

	What it does:
	• /ralph-loop - starts an iterative loop
	• /cancel-ralph - cancels the loop
	• The stop hook creates a self-referential feedback loop

	Named after Ralph Wiggum (The Simpsons) - "I'm a member of the 508th Airborne!"</div>
	</div>

	<div class="post">
	<span class="tag">Mechanistic Interpretability</span>
	<h2>🔬 Gemma Scope 2 + Neuronpedia</h2>
	<div class="date">AI Interpretability Stack</div>
	<div class="content">The complete ecosystem for mechanistic interpretability:

	<strong>Gemma Scope 2</strong> (Google DeepMind)
	• 110 petabytes of data for Gemma 3 (270M-27B)
	• SAEs and transcoders
	• <a href="https://deepmind.google/blog/gemma-scope-2" class="link">DeepMind Blog</a>

	<strong>Neuronpedia</strong> (Open Source)
	• Interactive Steering Platform
	• Explore &amp; experiment with gemma-scope-2
	• <a href="https://neuronpedia.org/gemma-scope-2" class="link">neuronpedia.org/gemma-scope-2</a>

	Together: transcoders outperform SAEs • circuit insights • real-time steering</div>
	</div>

	<div class="post">
	<span class="tag">HuggingFace</span>
	<h2>🤗 HuggingFace Skills</h2>
	<div class="date">Available Skills</div>
	<div class="content">Available in your Claude Code installation:

	• model-trainer - SFT, DPO, GRPO training
	• hugging-face-paper-publisher - publish papers
	• hugging-face-dataset-creator - create datasets
	• hugging-face-evaluation-manager - manage evaluation results

	Available via the Skills tool.</div>
	</div>
	</div>
	</body>
	</html>