I Built an Agentic Ferrari in Rust… and Nobody’s Driving It

Community Article · Published February 23, 2026

I built an agentic Ferrari in Rust.

It’s fast, ridiculously low overhead, and honestly a little absurd: multi-agent orchestration, tool routing, local memory, event streaming, safety gates—the whole thing.

And right now… almost nobody’s driving it.

(Screenshot: orchestrator view)

Developers get workflows. Everyone else gets chatboxes. Tandem brings workflows to everyone.

The brutally honest origin story

This started because I wanted Anthropic’s Cowork on Windows. I thought the idea was that good.

The bigger motivation is the gap: developers get real AI workflows (CLI/IDEs), while everyone else gets a chat box. I can use the dev tools — the problem is most people can’t. Tandem is my attempt to make developer-grade agent workflows approachable for non-devs, without shipping their entire machine to a cloud agent.

At first I used OpenCode to move fast. But once I cared about owning the rules—custom endpoints, tool semantics, streaming events, safety gates—I hit the wall.

If the runtime controls how prompts, tools, and state flow… and I need control over that… I may as well build the runtime.

That’s when “a Tauri app” turned into a full local agent engine.

Tandem isn’t a desktop app. It’s an engine with clients.

Tandem is a headless agent runtime written in Rust. The UI is just a client.

Today it ships with:

  • tandem-engine (Rust): orchestration, tools, memory, event streaming
  • Desktop app (Tauri + React): Plan Mode, visual diffs, approve-to-execute UX
  • TUI: a terminal cockpit for the same engine
  • Headless VPS Portal: 9 working React examples (Deep Research, Swarms, Incident Triage) to show how easy it is to build custom clients
  • Guides to build your own GUI/clients on top of the engine API

Ferrari translation:

  • Engine = tandem-engine
  • Dashboard = Desktop UI
  • Track cockpit = TUI
  • Telemetry = event stream

Why Rust (the real reason)

I wanted to push horsepower downward.

Agent systems aren’t one-off calls. They’re loops: plan → act → observe → revise. That means the runtime ends up doing a lot of “boring but heavy” work constantly:

  • managing orchestration state (like my supervised "Planner → Builder → Validator" sub-agent loop)
  • streaming structured events
  • tool routing + retries + budgets + sandboxed Python venv execution
  • indexing and memory reads/writes
  • parsing noisy web pages or extracting text from PDFs/DOCXs into something models can actually use
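The loop above can be sketched in a few lines. This is a toy illustration of the plan → act → observe → revise cycle with a tool-call budget; every type and name here is hypothetical and not the actual tandem-engine API:

```rust
/// One phase of the supervised agent loop.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Phase {
    Plan,
    Act,
    Observe,
    Revise,
}

/// Drive the loop until the tool-call budget is spent.
/// Returns the ordered trace of phases — exactly the kind of
/// structured event stream a client would render as a timeline.
fn run_mission(tool_budget: u32) -> Vec<Phase> {
    let mut trace = Vec::new();
    let mut calls_used = 0;
    while calls_used < tool_budget {
        for phase in [Phase::Plan, Phase::Act, Phase::Observe, Phase::Revise] {
            trace.push(phase);
            if phase == Phase::Act {
                // Each Act phase consumes one tool call from the budget.
                calls_used += 1;
            }
        }
    }
    trace
}

fn main() {
    let trace = run_mission(2);
    // Two full plan→act→observe→revise cycles: 8 phases total.
    assert_eq!(trace.len(), 8);
    println!("{} phases executed", trace.len());
}
```

The point of modeling it this way: the runtime owns the loop and its budgets, so a client only has to consume the trace.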

Rust is where I want that work to live.

Then frontends can do what they’re best at:

  • look polished (React + motion)
  • stay responsive
  • render plans, diffs, timelines, logs
  • avoid becoming a spaghetti bowl of orchestration logic

This separation is the whole point: engine = responsibility, clients = experience.

Provider freedom (because vendor lock-in sucks)

Tandem isn't tied to a specific model. Bring your own API key (OpenRouter, Anthropic, OpenAI) or go completely offline with local models via Ollama. The engine doesn't care; it simply routes and executes.
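In trait terms, "the engine doesn't care" looks roughly like this. A minimal sketch of provider-agnostic routing; the trait, the stub providers, and the placeholder responses are all illustrative, not the engine's real types:

```rust
/// Anything that can turn a prompt into a completion.
trait Provider {
    fn name(&self) -> &'static str;
    fn complete(&self, prompt: &str) -> String;
}

struct Ollama;
impl Provider for Ollama {
    fn name(&self) -> &'static str { "ollama" }
    fn complete(&self, prompt: &str) -> String {
        // A real implementation would hit the local Ollama HTTP API.
        format!("[local] {prompt}")
    }
}

struct OpenRouter;
impl Provider for OpenRouter {
    fn name(&self) -> &'static str { "openrouter" }
    fn complete(&self, prompt: &str) -> String {
        // A real implementation would call the hosted API with your key.
        format!("[cloud] {prompt}")
    }
}

/// Route by name; no vendor is hard-coded into the engine.
fn route(providers: &[Box<dyn Provider>], name: &str, prompt: &str) -> Option<String> {
    providers
        .iter()
        .find(|p| p.name() == name)
        .map(|p| p.complete(prompt))
}

fn main() {
    let providers: Vec<Box<dyn Provider>> = vec![Box::new(Ollama), Box::new(OpenRouter)];
    let out = route(&providers, "ollama", "hello").unwrap();
    assert!(out.starts_with("[local]"));
    println!("{out}");
}
```

Swapping providers is then a config change, not a code change.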

The trust problem: Choosing your level of control

Letting an LLM blindly write files “live” in the background is a quick way to lose trust forever.

Instead of a one-size-fits-all approach, Tandem lets you dial the trust level up or down depending on what you're doing.

In regular chat sessions, you pick the exact mode per prompt:

  • Ask: Q&A without making any file changes at all.
  • Explore: Analyze and explore the codebase safely.
  • Immediate: Execute changes directly for quick, low-risk edits.
  • Plan (Zero Trust): The agent proposes a staged execution plan, the UI shows visual diffs, and a human explicitly clicks Execute.
  • Coder: Focused specifically on code generation.
  • Orchestrate: An AI plans and executes multi-step workflows.
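The mode names above map naturally onto a gating check. This sketch is my own guess at how such a gate could look; which modes may write without approval is an assumption here, not a statement of Tandem's actual policy:

```rust
/// Per-prompt trust mode, mirroring the list above.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Mode {
    Ask,
    Explore,
    Immediate,
    Plan,
    Coder,
    Orchestrate,
}

/// May the agent touch the filesystem without human approval?
/// (Assumed policy: only the execute-style modes qualify.)
fn can_write_unapproved(mode: Mode) -> bool {
    matches!(mode, Mode::Immediate | Mode::Coder | Mode::Orchestrate)
}

/// Does this mode require an explicit human click before execution?
fn needs_approval(mode: Mode) -> bool {
    mode == Mode::Plan
}

fn main() {
    assert!(!can_write_unapproved(Mode::Ask));
    assert!(!can_write_unapproved(Mode::Plan));
    assert!(needs_approval(Mode::Plan));
    println!("Plan mode gates writes behind approval");
}
```

Centralizing the check in the engine means every client — Desktop, TUI, or a custom dashboard — inherits the same safety behavior for free.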

For serious architectural work, you enter the Command Center: A dedicated cockpit for launching orchestrator missions and managing manual swarm interventions. You give the objective, and the engine coordinates Planner/Builder/Validator sub-agents while giving you live telemetry on tokens, runtime, and tasks.

For background work, there's Agent Automation (WIP): A separate hub for Scheduled Bots (like Daily Research or Issue Triage) and MCP Connector operations where you set explicit bounds and let it run.

Plan Mode is slower than "just do it," but that safety net is what makes a local agent feel safe enough to keep installed. (I also encrypt your API keys locally with AES-256-GCM, because I mean it when I say "developer-grade.")

(Screenshot: Command Center)

Local memory without the usual “install a database” tax

Most “local-first” tools quietly stop being local-first the moment RAG enters the picture and an external vector database becomes a requirement.

I wanted long-term memory without asking users to run Postgres/pgvector/Pinecone/etc.

So memory lives locally (SQLite + embedded vector search via sqlite-vec). It keeps setup friction low and makes the engine feel like an actual local tool, not a mini devops project.
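As a toy illustration of the nearest-neighbor lookup an embedded vector store performs under the hood: this is plain in-memory cosine similarity, not the actual sqlite-vec API, just the shape of the query it answers.

```rust
/// Cosine similarity between two embedding vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 { 0.0 } else { dot / (norm_a * norm_b) }
}

/// Return the index of the stored embedding closest to the query —
/// the core of a "recall relevant memories" call.
fn nearest(store: &[Vec<f32>], query: &[f32]) -> Option<usize> {
    store
        .iter()
        .enumerate()
        .max_by(|(_, a), (_, b)| {
            cosine(a, query)
                .partial_cmp(&cosine(b, query))
                .unwrap_or(std::cmp::Ordering::Equal)
        })
        .map(|(i, _)| i)
}

fn main() {
    let store = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![0.7, 0.7]];
    let query = [0.9, 0.1];
    // The query points almost entirely along the first axis.
    assert_eq!(nearest(&store, &query), Some(0));
    println!("nearest memory: index 0");
}
```

sqlite-vec does the same kind of lookup inside SQLite itself, which is why no separate database process is needed.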

So why is the Ferrari parked?

Because capability isn’t adoption.

I built the engine. I shipped two clients. I wrote docs. It works.

But most people don’t wake up looking for “an agent runtime.” They want a workflow that succeeds in 60 seconds, onboarding that’s obvious, and trust that’s earned immediately.

A Ferrari is useless if:

  • nobody knows where the keys are, or
  • they’re scared it’ll go through the garage door.

Help me find drivers (or build a better steering wheel)

If you try Tandem and bounce in the first 5 minutes, I want to know where. That feedback is worth more than compliments.

And if you build a client on top of the engine: I’ll happily link it in the docs.

Over-engineered? Probably. 😄
Necessary? Also probably. Because the moment you want safety gates, streaming state, memory, and multiple clients, you’re not building an app anymore—you’re building a runtime.

And that’s the real punchline: Tandem isn’t “one UI.” It’s a local engine that can serve dozens or even hundreds of clients on the same machine—Desktop, TUI, tiny custom dashboards, scripts, automations, whatever you want—without rebuilding the core. (I just shipped 9 example dashboards in my vps-web-portal to prove it.)

The Ferrari isn’t the dashboard. It’s the engine.

