Assembling a dream team without a single hire
I’ve been making great progress on CVOYA’s first product — a “lifelong personal intelligence” that acts as the user’s second mind. It remembers your experiences, learns from your life, and evolves with you. Building it has been the most fun I’ve had in my career. More on the product and the name hopefully soon.
Using AI coding agents has greatly amplified me as a creator, architect, and developer. But at some point, even amplified, one person hits a wall. The codebase grew. Context-switching between backend services, the web UI, iOS, and infrastructure became a drag. The natural next step was obvious: build a team.
So I started the usual conversations. Potential co-founders, key hires. They’re still ongoing, but they take time. I thought about raising money to hire faster. And then a different thought occurred to me.
We live in a new world. Why follow the old playbook?
Instead of raising money to hire humans, what if I spent those dollars on LLM tokens and built the team myself? I get to define how they work, what they prioritize, how they collaborate. And they can work around the clock.
I’m certainly not the first to have this idea. There are likely many developers out there doing the same thing right now — wiring up AI agents in ways that improve their productivity. This is just my version of the story: what I built, how it works, and what I learned along the way.
Tokens as the New Unit of Investment
Here’s a thought I keep coming back to: tokens are going to become part of how VCs invest in startups. Instead of — or alongside — cash for salaries, investors could negotiate bulk token capacity with LLM providers and offer it as a resource to portfolio companies. Maybe this is already happening and I just don’t know because I haven’t talked to VCs yet — I’m self-funding CVOYA for now. But the economics feel inevitable. A token bundle is a unit of labor that scales differently than a headcount.
The Recursive Insight
In one of my conversations with Mike Calcagno, he inspired me to think about using what I plan to offer end users as a way to build the product itself. The product gives users agent-driven intelligence with initiative, background processing, task delegation. What if I turned those same ideas inward? What if my AI developers each had their own memory, learned continuously from their work, and collaborated through the same tools my users will eventually use?
That’s what I built. In about a day.
Side note: Mike’s early feedback on the product has been instrumental. If it turns out to be even a little useful, the world should know.
The Team
I set up four developer agents, each running in its own container:
- Ada and Dijkstra — Backend specialists, named after Ada Lovelace and Edsger Dijkstra. .NET microservices, Entity Framework, PostgreSQL, service-to-service auth. They know the API layer, the MCP protocol, the database schema.
- Kay — Frontend specialist, named after Alan Kay. Next.js, React, SwiftUI. Handles the web app and can work on the iOS client.
- Hopper — Infrastructure specialist, named after Grace Hopper. CI/CD pipelines, container orchestration, deployment automation.
Each agent has a YAML configuration that defines its specialty, which parts of the codebase it owns, what build and test commands to run, and even a custom system prompt that tells it what it’s good at. Here’s an example:
```yaml
name: ada
display_name: "Ada"
specialty: backend
failure_budget: 3
max_concurrent_worktrees: 3
git_email: "..."
agent_prompt: "..."
```
A “team leader” — a Python service running alongside the agents — monitors the GitHub project board, matches incoming issues to the right agent based on board fields, and orchestrates the whole workflow. The team leader talks to GitHub Projects v2 via GraphQL, reading and writing the board’s Status, Agent, and Priority fields directly. No labels, no webhook hacks — it operates on the same board I look at every morning.
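The matching step can be sketched as a pure function over a board response that has already been fetched and decoded. Everything below is illustrative: the real GraphQL query against Projects v2 is omitted, and the flattened item shape is my assumption, not the actual team leader's internals.

```python
def ready_items(project_items: list[dict]) -> list[dict]:
    """Pick issues whose board Status is 'Ready', highest priority first.

    `project_items` stands in for the decoded item list of a Projects v2
    GraphQL query; the field names mirror the board's Status, Agent, and
    Priority fields described in the post.
    """
    picked = []
    for item in project_items:
        fields = {f["field"]: f["value"] for f in item["fields"]}
        if fields.get("Status") == "Ready":
            picked.append({
                "number": item["number"],
                "agent": fields.get("Agent"),            # None: supervisor decides
                "priority": fields.get("Priority", "P2"),
            })
    # "P0" < "P1" < "P2" sorts lexicographically, so plain string sort works
    return sorted(picked, key=lambda i: i["priority"])

# Illustrative board snapshot: one in-progress issue, two ready ones
board = [
    {"number": 42, "fields": [{"field": "Status", "value": "Ready"},
                              {"field": "Agent", "value": "ada"},
                              {"field": "Priority", "value": "P1"}]},
    {"number": 57, "fields": [{"field": "Status", "value": "In progress"},
                              {"field": "Agent", "value": "kay"}]},
    {"number": 63, "fields": [{"field": "Status", "value": "Ready"},
                              {"field": "Priority", "value": "P0"}]},
]
print([i["number"] for i in ready_items(board)])  # [63, 42]
```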
One friction point: GitHub doesn’t have a concept of “bot developer” accounts that can appear in issue assignee lists. I had to create a real GitHub account with a separate email for each agent so they show up as assignees on the board. It works, but it’s a workaround. GitHub supports custom agents within Copilot, but not as first-class participants in the broader platform. If anyone at GitHub is reading this — native support for non-human collaborators in Issues and Projects would unlock a lot of workflows like this one.

How It Actually Works
The workflow mirrors what I was already doing when working with a single AI coding agent (Cursor, Claude Code, Copilot — I use many of them). The difference is that it’s now automated and parallelized:
- An issue lands in the “Ready” state. I create it, or an agent creates it as a sub-task of a larger plan. It goes onto the GitHub project board. I set its Priority (P0/P1/P2) and assign it to an agent using the board’s `Agent` field — or leave it unassigned and let the supervisor figure it out.
- The team leader picks it up. It polls the project board via GraphQL for issues with `Status=Ready`, matches them to available agents by the `Agent` field, ranks by priority, and kicks off the work. The board column moves to `In progress` automatically.
- The agent plans. It reads the issue, consults the architecture docs, recalls its own memory of past work, and produces an implementation plan. The plan gets posted as a comment on the issue.
- I review the plan. This is the human-in-the-loop gate. I read the plan, maybe leave feedback, and comment `plan-approved` when I’m satisfied. The agent won’t write a line of code until this happens (unless the issue is trivial enough to skip approval).
- The agent implements. It creates a git worktree, writes code, runs build and tests, rebases on main, and opens a PR with `Closes #N`. The whole thing.
- I review the PR, and the agent merges. If I request changes, the agent addresses them. Once approved, it squash-merges and cleans up.
Going forward, I could add other agents with a different focus when reviewing plans and code (e.g. “ensure end-to-end architectural integrity”, “perform security reviews”, or “ensure accurate test coverage”).
The State Machine
Under the hood, each agent’s work is governed by a state machine. The state machine enforces valid transitions. An agent can’t jump from planning to merged. It can’t skip approval. It has to follow the process.
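A minimal sketch of such a state machine. The state names are my guesses at the phases this post describes, not the actual implementation:

```python
from enum import Enum, auto

class State(Enum):
    READY = auto()
    PLANNING = auto()
    AWAITING_APPROVAL = auto()
    IMPLEMENTING = auto()
    IN_REVIEW = auto()
    MERGED = auto()
    FAILED = auto()

# Legal transitions only; MERGED and FAILED are terminal.
TRANSITIONS = {
    State.READY: {State.PLANNING},
    State.PLANNING: {State.AWAITING_APPROVAL, State.FAILED},
    State.AWAITING_APPROVAL: {State.IMPLEMENTING, State.FAILED},
    State.IMPLEMENTING: {State.IN_REVIEW, State.FAILED},
    State.IN_REVIEW: {State.IMPLEMENTING, State.MERGED, State.FAILED},
}

def advance(current: State, target: State) -> State:
    """Reject anything not in the table, e.g. jumping from planning to merged."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target

print(advance(State.READY, State.PLANNING).name)  # PLANNING
```

The point of the table is that skipping the approval gate is not a policy the agent can ignore; it is simply not a reachable transition.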
Failure Budgets
Each agent gets a failure budget — currently set to 3. Build and test failures don’t count against it (those are normal during development). But if the agent gets confused, stuck in a loop, or makes no progress, that counts.
After 3 non-build failures on a single issue, the agent stops, files an investigation issue, and moves on. No infinite loops. No burning tokens on a dead end.
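The budget logic is simple enough to sketch. The `kind` strings below are illustrative labels, not the real classifier:

```python
class FailureBudget:
    """Tracks an agent's non-build failures on a single issue."""

    def __init__(self, budget: int = 3):
        self.budget = budget
        self.failures = 0

    def record(self, kind: str) -> bool:
        """Return True when the agent should stop and file an investigation issue."""
        if kind in ("build", "test"):   # normal during development, never counted
            return False
        self.failures += 1              # confusion, loops, no progress
        return self.failures >= self.budget

fb = FailureBudget()
print(fb.record("build"))        # False: build failures are free
print(fb.record("stuck-loop"))   # False (1 of 3)
print(fb.record("no-progress"))  # False (2 of 3)
print(fb.record("confused"))     # True: stop, file an issue, move on
```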
Parallel Work with Worktrees
Each agent can work on up to 3 issues simultaneously using git worktrees — isolated copies of the repository. When an agent submits a plan for review and is waiting for approval, it doesn’t sit idle. It picks up another issue in a fresh worktree.
This means a single backend agent might be implementing one feature, waiting for plan approval on another, and responding to PR feedback on a third. All at the same time.
```
/home/agent/worktrees/
  agent-42-add-batch-memory-endpoint/
  agent-57-fix-auth-middleware/
  agent-63-refactor-service-discovery/
```
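The naming scheme and the concurrency cap can be sketched as follows. The actual worktree creation would shell out to `git worktree add -b <branch> <path>`; the helper names here are hypothetical:

```python
MAX_CONCURRENT_WORKTREES = 3  # mirrors max_concurrent_worktrees in the agent's YAML

def worktree_path(agent_home: str, issue: int, slug: str) -> str:
    """Directory scheme from the listing above: agent-<issue>-<slug>."""
    return f"{agent_home}/worktrees/agent-{issue}-{slug}"

def can_take_work(active_worktrees: list[str]) -> bool:
    """An issue waiting on plan approval still occupies its worktree,
    but the agent may start a fresh issue while capacity remains."""
    return len(active_worktrees) < MAX_CONCURRENT_WORKTREES

print(worktree_path("/home/agent", 42, "add-batch-memory-endpoint"))
# /home/agent/worktrees/agent-42-add-batch-memory-endpoint
```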
Cross-Agent Collaboration
Agents can collaborate without me as an intermediary.
When Kay needs a new API endpoint to build a feature, it creates a GitHub issue, adds it to the project board, and marks its own work as blocked. The supervisor sees the new issue in the Ready column, assigns it to Ada, and when the backend work merges, Kay automatically unblocks and resumes.
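That unblocking rule can be sketched as a pass over the board. The issue dicts below are illustrative, not the real board schema:

```python
def unblock_ready(issues: dict[int, dict]) -> list[int]:
    """Flip Blocked issues back to Ready once every blocker has merged."""
    unblocked = []
    for number, issue in issues.items():
        blockers = issue.get("blocked_by", [])
        if issue["status"] == "Blocked" and all(
            issues[b]["status"] == "Merged" for b in blockers
        ):
            issue["status"] = "Ready"
            unblocked.append(number)
    return unblocked

# Kay's feature (#57) waits on Ada's endpoint (#42); once #42 merges, #57 resumes
board = {
    42: {"status": "Merged"},
    57: {"status": "Blocked", "blocked_by": [42]},
}
print(unblock_ready(board))  # [57]
```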
They don’t chat, at least not yet 🙂 They don’t Slack each other. They communicate through the artifact that matters: the code, the issues, the PRs.
Agent Memory
Each agent maintains its own memory — markdown files committed alongside the code:
- codebase-understanding.md — What the agent has learned about the architecture: which service does what, where the tricky parts are, how the auth flows work.
- learnings.md — Patterns discovered, mistakes to avoid. Append-only, so nothing gets lost.
- work-history.md — A log of completed tasks and what was learned from each.
- issues/N.md — Working notes for each active issue.
Before starting any task, the agent reads its own memory. After finishing, it updates it. The memory files travel with the PR, so I can review what the agent learned alongside the code it wrote.
This means agents don’t start from scratch every time. They build context over weeks and months.
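A minimal sketch of that read-before, append-after cycle, assuming the memory lives in the per-agent markdown files listed above. The helper names and the sample lesson are hypothetical:

```python
from pathlib import Path
import tempfile

MEMORY_FILES = ("codebase-understanding.md", "learnings.md", "work-history.md")

def recall(memory_dir: Path) -> str:
    """Read the agent's memory files into one prompt-ready context string."""
    parts = []
    for name in MEMORY_FILES:
        f = memory_dir / name
        if f.exists():
            parts.append(f"## {name}\n{f.read_text()}")
    return "\n\n".join(parts)

def learn(memory_dir: Path, lesson: str) -> None:
    """Append-only, so nothing gets lost."""
    with open(memory_dir / "learnings.md", "a", encoding="utf-8") as f:
        f.write(f"- {lesson}\n")

# demo in a throwaway directory; the lesson text is made up
mem = Path(tempfile.mkdtemp())
learn(mem, "run database migrations before service startup")
print("run database migrations" in recall(mem))  # True
```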
What It’s Like to Use
Honestly? It’s satisfying. I open the GitHub board in the morning and see issues that moved while I was asleep. PRs waiting for my review, plans waiting for my approval, sub-tasks created by agents for other agents.
The team leader dashboard shows me each agent’s state at a glance: who’s implementing, who’s waiting, who’s idle. I can manually assign a high-priority issue, pause an agent, or check why one failed.
It feels less like using a tool and more like leading a team. A quiet, tireless, highly literal team that follows the engineering process exactly as defined.

What’s Next
First, dynamic scaling
I started with four agents. But the system is designed so that adding a new one is just a YAML file and a container (a possible improvement: have the team leader discover the new agent definition and launch the container itself). The team leader doesn’t care how many agents it manages. What’s more interesting is where this leads: the team could grow and shrink based on demand. A crash report comes in from production — an issue is created automatically, triaged, assigned to Ada. A user gives feedback through the app — another issue, another agent picks it up. As new features ship and generate more work, the team scales up. When the backlog is clear, it scales down. No hiring cycles, no layoffs. Just capacity that follows the work.
Second, giving the agents real memory.
Right now, agent memory is flat markdown files in the repository. It works, but it won’t scale. As the agents complete more tasks, their memory files grow. Larger memories mean more tokens consumed per task and a decreasing ability to recall what’s actually relevant.
This is where the product I’m building closes the loop. The “lifelong personal intelligence” I’m creating for end users features a self-organizing memory system with semantic indexing, episodic memory groups, contextual recall, and much, much more. What if I replaced the agents’ flat files with that same memory system? Instead of grepping through a growing markdown file, an agent could query its own memory graph: “What do I know about the authentication middleware?” and get back exactly the relevant context — nothing more, nothing less.
Another idea would be to build on Dennis Pilarinos’s excellent work on Unblocked, “the context layer for AI-driven development.” Every agent would have access to everything there is to know about the product being built.
The agents building the product would use the product to get better at building the product. That’s the recursive loop I’m chasing.
Third, and this one isn’t technical: I’m going to miss working with people.
There is one aspect of an AI team that doesn’t make me happy. I’m going to miss the hallway conversations that spark unexpected ideas. Mentoring a junior engineer through their first production incident. Debating architecture over coffee (and arguing with Jim Webber about technology, even though we find ways to do that anyway). Having drinks after a hard launch.
The AI team is a force multiplier. But it’s not a replacement for human collaborators. When the right people come along, they will. And they’ll have a very productive team of AI colleagues waiting for them.
I am going to start collecting data on the team’s output — issue closing rates, PR cadence, time-to-merge, token costs per issue. In a follow-up post, I’ll share the numbers and what they reveal about where AI developers excel and where they struggle. Stay tuned.