
008: Creating CPE

Building CPE: From Chat Tool to Programmable Agent Harness

I've been working on a project called CPE for the past eight months, and I've used it daily since day one. What started as a simple chat-based code editor — inspired by aider back in July 2024, which in AI years is a lifetime ago — has evolved into something quite different: a general-purpose, self-assembling agent harness that power users can configure exactly the way they want.

Today, CPE lets you define your own system prompts, connect any tool via MCP with selective filtering, compose agent skills, manage conversations with forking and cross-model resumption, and — most importantly — run a Code Mode where the AI writes and executes Go programs that call your tools as typed functions. But it took a while to get here.

The Problem

CPE was born before the era of sidebar agents like Cursor's Composer or Windsurf's Cascade, and before the command-line agents that are everywhere today. At the time, most code assistants used only a fraction of the available context window, relying on semantic search to save on tokens — understandable when plans were subscription-based rather than pay-as-you-go.

The thing is, with a little experimentation, I could clearly see that as long as I provided the right context myself, the model generated code that was plausibly something I would write — just much faster. I didn't need a fancy retrieval system. I needed control.

I had a few specific frustrations: the assistant, not me, decided what context the model saw; semantic search retrieved fragments when I already knew exactly which files mattered; and I had no way to shape the system prompt or the toolset for the task at hand.

The Journey

Phase 1: Pipe In, Get Text Out

CPE started as bare-bones as it gets. Pipe text via stdin, get a response. Copy code into a file, pass it as input, get analysis back. Then I added tools so the LLM could actually modify the local filesystem. Then utility commands: token counting per file, a tree view showing how much each file contributed to the total token budget. Useful, but still pretty simple.

Phase 2: Going All-In on MCP

When the Model Context Protocol came along, I had a realization: the LLM only ever interacts with tools. What matters is the flexibility to design your system prompt in conjunction with the tools you provide for a given task. Everything else is plumbing.

So I made a sharp pivot: CPE became a thin MCP client. I stripped out all the built-in functionality — even file reading and writing — and created separate MCP servers for everything. Need to list files? That's an MCP server. Need to edit text? Another MCP server. Along the way, I'd spin up minimal servers for whatever I needed or adopt existing third-party ones.

This was philosophically clean. In practice, it had problems.

Phase 3: The MCP Problem (and Discovering Code Mode)

As many in the ecosystem have learned, MCP can easily overload the context window. Install a popular MCP server, get excited about the capabilities — and then realize it exposes 30 tools with verbose descriptions that confuse the model and bloat every request.

I tried to make it work. I looked at overriding tool descriptions or replacing schemas, but that felt like fighting against MCP's design rather than working with it. The real issue was deeper: I wanted composability. I wanted the model to chain multiple tool calls, use conditionals, loop over results — all things that the one-tool-call-at-a-time paradigm doesn't support well.

Then I read the Cloudflare blog post on Code Mode and Anthropic's post on programmatic tool calling, and it clicked. Instead of exposing tools as individual actions the model invokes one at a time, what if the model could write a program that calls tools as functions?

Phase 4: Code Mode in Go

The initial prototype tried TypeScript, but I landed on Go for the execution language. The result: CPE exposes MCP tools as typed Go functions — with structs generated from each tool's input and output schema, and the tool description as the function's doc comment. The model generates a complete Go program implementing a Run function, CPE compiles and executes it, and the result comes back.
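As a sketch, a generated binding might look like the following. The tool name, struct fields, and doc comment are illustrative stand-ins for what CPE would derive from a real MCP tool's schemas and description; the body is stubbed here so the example is self-contained, whereas the real binding would forward the call to the MCP server.

```go
package main

import "fmt"

// ReadFileInput mirrors the tool's input JSON schema.
type ReadFileInput struct {
	Path string `json:"path"`
}

// ReadFileOutput mirrors the tool's output JSON schema.
type ReadFileOutput struct {
	Content string `json:"content"`
}

// ReadFile reads the contents of a file at the given path.
// (The doc comment would come from the MCP tool description;
// this body is a stub standing in for the real MCP round trip.)
func ReadFile(in ReadFileInput) (ReadFileOutput, error) {
	return ReadFileOutput{Content: "stub contents of " + in.Path}, nil
}

// Run is the entry point the model implements; CPE compiles and executes it.
func Run() error {
	out, err := ReadFile(ReadFileInput{Path: "main.go"})
	if err != nil {
		return err
	}
	fmt.Println(out.Content)
	return nil
}

func main() {
	if err := Run(); err != nil {
		fmt.Println("error:", err)
	}
}
```

Because the bindings are typed, the compiler catches malformed tool calls before anything executes, which is feedback the model gets for free.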

This opened up everything. The model can chain multiple tool calls, branch on intermediate results, loop over collections, and run independent steps concurrently.

Instead of a dozen round-trips between the model and the tool server, the model writes one program that does it all. A frontier model can one-shot these compositions in seconds.
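A composition of that kind might look like this sketch. ListFiles and CountTokens are hypothetical tool bindings, stubbed so the program runs standalone; the point is the shape of the program, not the specific tools.

```go
package main

import (
	"fmt"
	"strings"
)

// Stubbed stand-ins for generated MCP tool bindings.
func ListFiles(dir string) []string {
	return []string{"main.go", "README.md", "agent.go"}
}

func CountTokens(path string) int {
	return len(path) * 100 // stand-in for a real token count
}

// Run chains tool calls with a loop and a conditional, the kind of
// composition a one-tool-call-at-a-time protocol needs many turns for.
func Run() error {
	total := 0
	for _, f := range ListFiles(".") {
		if !strings.HasSuffix(f, ".go") {
			continue // filter results locally, no extra model round trip
		}
		total += CountTokens(f)
	}
	fmt.Printf("Go files total ~%d tokens\n", total)
	return nil
}

func main() { _ = Run() }
```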

Phase 5: Sub-Agents as Go Functions

The final piece fell into place when I realized that CPE itself could be exposed as an MCP server — and since any MCP tool becomes a Go function in Code Mode, sub-agents became just function calls.

The orchestrating agent can spin up a sub-agent to handle a tangentially related task, get back a result, and continue — without that sub-task consuming any of its own context window. The sub-agent has its own context, its own system prompt, its own conversation. And because it's a Go function, you can call it inside a goroutine, inside a loop, conditionally, with templated inputs. The outputs can be parsed, filtered, or aggregated.
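A minimal sketch of what that delegation looks like, assuming a hypothetical RunSubAgent binding (stubbed here; the real one would call CPE's MCP server and run a full sub-agent conversation):

```go
package main

import (
	"fmt"
	"sync"
)

// RunSubAgent stands in for a CPE-as-MCP-server binding. The real call
// would run a sub-agent with its own context and system prompt.
func RunSubAgent(prompt string) string {
	return "result for: " + prompt
}

// Run fans three sub-agent calls out across goroutines and aggregates
// the results; none of the sub-agents' context counts against the
// orchestrator's window.
func Run() error {
	tasks := []string{"summarize pkg/a", "summarize pkg/b", "summarize pkg/c"}
	results := make([]string, len(tasks))
	var wg sync.WaitGroup
	for i, t := range tasks {
		wg.Add(1)
		go func(i int, t string) {
			defer wg.Done()
			results[i] = RunSubAgent(t)
		}(i, t)
	}
	wg.Wait()
	for _, r := range results {
		fmt.Println(r)
	}
	return nil
}

func main() { _ = Run() }
```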

This is a fundamentally different model from how most agent frameworks handle delegation, and it's the pattern I'm most excited about exploring further.

Where It Is Today

I'm genuinely happy with where CPE is. Despite the crowded landscape of coding agents — both open-source and proprietary — it occupies a distinct niche: a minimalist, configurable harness where the user controls the system prompt, the tools, and the execution model.

Code Mode has been the standout success. I've used it for far more than software engineering: triaging my email through an IMAP-backed skill, for example, alongside everyday data-wrangling tasks.

Programming and software engineering are increasingly different things. LLMs are already remarkably good at programming — translating intent into code. CPE has become less of a "coding agent" for me and more of a general-purpose computer-use tool.

What's Next

A few things I want to explore:

Compaction

Modern context windows are large, and between Code Mode and sub-agents, I can handle most tasks without hitting the limit. For the remaining 5-10% of cases, I currently use a manual workaround: run a skill that produces a compaction summary, then start a new conversation with that summary as input. It works, but I'd like to make it a first-class feature — or better yet, make it possible through the hooks system described below.

Agent Loop Hooks

I want to add lifecycle hooks to the agent loop: events that fire before and after tool calls. The obvious uses are validation (linting after file edits, running tests after code changes), observability, and security (blocking potentially destructive operations before they execute).

The tricky part is making hooks flexible enough to modify the conversation, not just observe it. If hooks could rewrite the conversation state on the fly, compaction might not need to be a built-in feature at all — it could just be a hook that triggers when context usage crosses a threshold. But mutable hooks interact poorly with conversation persistence and cross-model resumption, so I'm still thinking this through. For now, the models follow instructions well enough that I can put pre/post-tool-call behaviors directly in the system prompt.
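One possible shape for the read-only variant, as a sketch; the interface and type names are hypothetical, not CPE's actual API:

```go
package main

import "fmt"

// ToolCall describes a pending or completed tool invocation.
type ToolCall struct {
	Name string
	Args map[string]any
}

// Hook is a hypothetical lifecycle hook interface for the agent loop.
type Hook interface {
	// BeforeToolCall can veto a call by returning an error.
	BeforeToolCall(tc ToolCall) error
	// AfterToolCall observes the result (e.g. for linting or logging).
	AfterToolCall(tc ToolCall, result string)
}

// blockDestructive is a security hook that rejects dangerous operations.
type blockDestructive struct{}

func (blockDestructive) BeforeToolCall(tc ToolCall) error {
	if tc.Name == "delete_file" {
		return fmt.Errorf("blocked destructive tool call: %s", tc.Name)
	}
	return nil
}

func (blockDestructive) AfterToolCall(tc ToolCall, result string) {
	fmt.Printf("tool %s returned %d bytes\n", tc.Name, len(result))
}

func main() {
	var h Hook = blockDestructive{}
	fmt.Println(h.BeforeToolCall(ToolCall{Name: "delete_file"}))
}
```

The mutable variant would return a modified conversation state instead of just an error, which is exactly where the persistence questions start.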

An Agent Standard Runtime

This is the most speculative idea, but potentially the most powerful.

Code Mode already runs arbitrary Go code, and my skills already import third-party libraries (the email skill uses an IMAP library to talk to Gmail). The model can discover how to use unfamiliar libraries on its own via go doc, which lets it inspect types, functions, and documentation for any Go package. If its parametric knowledge fails, it can just go look.

But raw third-party libraries are often too low-level. For the email case, it would be far more productive if the model could import a high-level module with functions like SearchEmail, GetThread, DeleteThread, LabelThread — rather than constructing IMAP commands from scratch each time.
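The surface I have in mind might look something like this; the signatures are hypothetical and the bodies stubbed, since no such package exists yet:

```go
package main

import "fmt"

// Thread is a minimal view of an email thread.
type Thread struct {
	ID      string
	Subject string
}

// SearchEmail would translate a query into IMAP commands under the hood.
// Stubbed here for illustration.
func SearchEmail(query string) []Thread {
	return []Thread{{ID: "t1", Subject: "demo: " + query}}
}

// LabelThread would apply a label to a thread via the mail provider.
func LabelThread(id, label string) error {
	fmt.Printf("labeled %s as %s\n", id, label)
	return nil
}

func main() {
	for _, th := range SearchEmail("from:billing") {
		_ = LabelThread(th.ID, "invoices")
	}
}
```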

I want to build a set of these purpose-built Go modules: an "agent standard runtime" of utilities that Code Mode can import. Structure-aware file editing (e.g., a ReplaceGoFunction that finds a function by name and replaces it, saving tokens on reproducing the original text). High-level wrappers for common services. Helpers for data processing.
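The structure-aware editor is the most concrete of these, since Go's standard library already does the hard part. A minimal sketch of a ReplaceGoFunction, using go/parser to locate the declaration and a byte-offset splice to swap it (the function name and signature are mine, not an existing API):

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// ReplaceGoFunction finds a top-level function named name in src and
// replaces the whole declaration with newFunc, so a caller (or model)
// never has to reproduce the original body.
func ReplaceGoFunction(src, name, newFunc string) (string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return "", err
	}
	for _, d := range file.Decls {
		fd, ok := d.(*ast.FuncDecl)
		if !ok || fd.Name.Name != name {
			continue
		}
		// Convert token positions to byte offsets and splice.
		start := fset.Position(fd.Pos()).Offset
		end := fset.Position(fd.End()).Offset
		return src[:start] + newFunc + src[end:], nil
	}
	return "", fmt.Errorf("function %q not found", name)
}

func main() {
	src := "package main\n\nfunc Greet() string { return \"hi\" }\n"
	out, err := ReplaceGoFunction(src, "Greet", "func Greet() string { return \"hello\" }")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

For a large function, the token savings are real: the model emits only the new body, never the old one.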

This echoes the Voyager paper — the famous work where an LLM in Minecraft created its own tools to make progress. The same pattern applies here: the agent, directed by the user or autonomously, creates utilities, documents them, and reuses them later. Other agents could share these modules. It looks like MCP or Claude Code plugins, but it's just Go modules — usable in Code Mode, in standalone Go scripts, or by any agent that can write Go.

I think this is a powerful pattern for building general-purpose agents, and I'm excited to see where it leads.


If any of this resonates, CPE is on GitHub. It's MIT-licensed, written in Go, and installs with a single go install. I'd love to hear what you think.