My multi-agent setup

A pricing change burned through a month of my Copilot budget in two days, so I rebuilt my AI dev stack around Claude Code, Cursor, Antigravity, and OpenCode. Here's what's running now, and what each tool actually costs.

โ† Back to Blog

Hey 👋

Quick rant before we get into it: GitHub Copilot burned through a month of my budget in under two days after a pricing change I didn't see coming. That was the push I needed to actually rebuild my AI dev stack instead of just grumbling about it.

I'd poked at Claude and Cursor before, on and off, but Copilot was always home base. Running DMS solo and building Booker Blitz means there's no team to lean on when a tool stops working for me, so I gave myself a weekend to fix it properly instead of patching around it.

A short, unsatisfying detour through Windsurf later (we don't need to talk about that one), I landed on a setup with two primaries and two backups: Claude Code and Cursor doing the heavy lifting, Antigravity and OpenCode covering the gaps.

Claude Code: the brain

Claude Code is where I plan. It holds onto multi-turn project discussions and brainstorming sessions better than anything else I've used in an IDE, which makes it the natural place to draft battle plans and build out core project skills before any code gets written.

The part that actually sold me, beyond the planning, is how easy it is to wire up MCP servers. I never found solid Capacities or TickTick plugins in any marketplace, so I built my own connectors. Claude now doubles as a centralized organizer that works the same way across every machine I touch.
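For a sense of what that wiring can look like, here is a minimal sketch that generates a project-scoped `.mcp.json` with one entry per connector. It assumes the common layout of a top-level `mcpServers` key whose entries carry `command`, `args`, and `env`, and it assumes the client expands `${VAR}` placeholders from the environment. The connector paths, the env var names, and the `nodeConnector` helper are all made up for illustration; they are not the connectors described above.

```typescript
// One stdio MCP server entry, in the shape assumed for .mcp.json.
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Hypothetical helper: describes a locally built Node connector.
// The token is referenced by env var name, so no secret lands in the repo.
function nodeConnector(scriptPath: string, tokenVar: string): McpServerEntry {
  return {
    command: "node",
    args: [scriptPath],
    env: { [tokenVar]: "${" + tokenVar + "}" },
  };
}

// Hypothetical connector locations and variable names.
const mcpConfig: McpConfig = {
  mcpServers: {
    capacities: nodeConnector(
      "./connectors/capacities/dist/index.js",
      "CAPACITIES_API_TOKEN"
    ),
    ticktick: nodeConnector(
      "./connectors/ticktick/dist/index.js",
      "TICKTICK_API_TOKEN"
    ),
  },
};

// Serialized form, ready to be written to .mcp.json at the repo root.
const mcpJson: string = JSON.stringify(mcpConfig, null, 2);
```

Keeping the file in the repo and the tokens in each machine's environment is one way to get the same organizer on every machine without copying secrets around.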

One opinion I'll defend: skip the big pre-baked "Getting Things Done" or "Superpower" style frameworks. They bloat your environment with commands you'll never run. Writing lean, product-specific skills instead keeps token spend down and your setup easy to reason about. Those bigger frameworks aren't bad; they're just overkill for focused, specialized work.

Cursor: the workhorse

Claude is smart, but it chews through token budget fast on repetitive, everyday coding. That's the gap Cursor fills. It's the cheaper option for day-to-day implementation, and it balances out how token-hungry Claude gets on deep logic work.

What sold me on Cursor, though, was its cloud agents and how closely they match the agent workflows I'd built elsewhere. It plugs into Linear directly: assign an issue, and a Cursor agent analyzes the workspace and opens an initial pull request on its own. That keeps iteration fast without burning through premium limits.

Antigravity and OpenCode: the backups

Antigravity has one job in this setup: front-end work. Both its CLI and desktop app are clunky, especially on Linux, so it stays out of backend and general logic work entirely.

Where it earns its place is visual implementation. Feed it a screenshot, say "make this screen better," and it gets the layout right. Cursor will sometimes claim a UI bug is fixed when it visibly isn't. Antigravity actually delivers.

OpenCode isn't a primary by any stretch (that's Claude Code's job), but it's a reliable utility player for the boring stuff: updating project memory, keeping MEMORY.md and other harness files current. I picked it up during a half-price first month, I'm on its free model tiers now, and it's held up fine for routine, programmatic documentation work.

💸 What it all actually costs

Here's the part every tool comparison skips or buries in fine print. Figures below are current as of June 2026 and move fast in this space, so treat them as a snapshot, not a promise.

| Tool | Pricing | Windows | Linux | Mac |
| --- | --- | --- | --- | --- |
| Claude Code | Pro $20/mo (includes Claude Code); Max $100 or $200/mo; Team Premium seats from ~$100/seat/mo; API billed per token | Native, no WSL required | Native | Native |
| Cursor | Free Hobby tier; Pro $20/mo; Pro+ $60/mo; Ultra $200/mo; Teams $40/seat/mo | Native | Native | Native |
| GitHub Copilot | Free tier; Pro $10/mo; Pro+ $39/mo; Max $100/mo; Business $19/seat/mo; Enterprise $39/seat/mo | Native (VS Code, Visual Studio, JetBrains) | Native (VS Code, JetBrains, Neovim) | Native |
| Antigravity | Free in public preview; ties into Google AI Pro ($20/mo) or Ultra ($100 to $200/mo) for priority access and higher quotas | Native, needs Windows 10 (1809+) or 11 | Native on supported 64-bit distros (Ubuntu 20+, Debian 10+, Fedora 36+, RHEL 8+) | Native |
| OpenCode | Free and open source, bring your own API key; optional managed tier from around $10/mo | Native, CLI and desktop beta | Native, CLI and desktop beta | Native, CLI and desktop beta |

Copilot is the only one with zero platform friction, since it's a plugin riding inside editors that already run everywhere. Antigravity sits at the other end. Cross-platform on paper, but the Linux build has the pickiest requirements of the group, and it shows.

The agents doing the actual work

On top of the two primaries, I run a small fleet of custom agents and skills:

Project Agent Creator Skill scans a project's dependencies and architecture, then decides what needs an isolated autonomous agent and what's better served by a lightweight skill.

Project Planner holds deep context on the codebase and the roadmap, breaks down complex tasks, and writes a strategy file straight into the repo as plan.md. That file is the real insurance policy. If I burn through token budget on one provider mid-task, I hand plan.md to another (Claude to Cursor, say) and keep moving without losing the thread.
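To make the handoff idea concrete, here is a rough sketch of a plan file that any provider can read back. It assumes plan.md is a plain Markdown checklist with one task per line; the real file's layout isn't specified here, and `PlanTask`, `renderPlan`, `parsePlan`, and `nextTask` are hypothetical names.

```typescript
type TaskStatus = "todo" | "done";

interface PlanTask {
  id: string;
  title: string;
  status: TaskStatus;
}

// Render the plan as a Markdown checklist, one line per task.
function renderPlan(goal: string, tasks: PlanTask[]): string {
  const lines = tasks.map(
    (t) => `- [${t.status === "done" ? "x" : " "}] ${t.id}: ${t.title}`
  );
  return [`# Plan: ${goal}`, "", ...lines, ""].join("\n");
}

// Parse the checklist back, skipping any line that is not a task.
function parsePlan(markdown: string): PlanTask[] {
  const tasks: PlanTask[] = [];
  for (const line of markdown.split("\n")) {
    const match = /^- \[( |x)\] ([^:]+): (.+)$/.exec(line);
    if (match) {
      tasks.push({
        id: match[2],
        title: match[3],
        status: match[1] === "x" ? "done" : "todo",
      });
    }
  }
  return tasks;
}

// The first unfinished task is where the next provider resumes.
function nextTask(tasks: PlanTask[]): PlanTask | undefined {
  return tasks.find((t) => t.status === "todo");
}
```

The point of the round trip is that nothing about the task state lives in a provider's chat history: whichever tool opens the file next can call something like `nextTask(parsePlan(fileContents))` and carry on.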

The Orchestrator acts as project manager for the whole fleet. Once a task list is locked, it splits the work into micro-tasks, assigns the right agent to each, and keeps plan.md updated in real time.
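The splitting and assigning step can be pictured as a routing table. This is only a sketch under assumptions: the task kinds, the agent names, and the `ROUTES` mapping are hypothetical stand-ins, and a real orchestrator would decide with far more context than a lookup.

```typescript
type AgentName = "planner" | "code-pusher" | "reviewer";

interface MicroTask {
  id: string;
  kind: "plan" | "git" | "review";
  description: string;
}

interface Assignment {
  taskId: string;
  agent: AgentName;
}

// Hypothetical routing table: task kind -> agent that owns it.
const ROUTES: Record<MicroTask["kind"], AgentName> = {
  plan: "planner",
  git: "code-pusher",
  review: "reviewer",
};

// Assign every micro-task to the agent registered for its kind.
function assign(tasks: MicroTask[]): Assignment[] {
  return tasks.map((t) => ({ taskId: t.id, agent: ROUTES[t.kind] }));
}

// Group task ids per agent so each agent receives a single batch.
function batchByAgent(assignments: Assignment[]): Map<AgentName, string[]> {
  const batches = new Map<AgentName, string[]>();
  for (const a of assignments) {
    const batch = batches.get(a.agent) ?? [];
    batch.push(a.taskId);
    batches.set(a.agent, batch);
  }
  return batches;
}
```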

Code Pusher handles the git lifecycle end to end: staging changes, running pre-commit checks, writing commit messages that follow local conventions (emojis included, when the convention calls for them), and opening the pull request. Minor review feedback gets applied automatically before it ever reaches me.
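The "follows local conventions" part is the easiest piece to sketch. The example below assumes a Conventional Commits style subject with an optional gitmoji shortcode prefix and a length limit; the `CommitConvention` shape, the shortcode mapping, and `formatCommitSubject` are hypothetical, and a real repo would supply its own rules.

```typescript
type ChangeType = "feat" | "fix" | "docs" | "chore";

interface CommitConvention {
  // Whether the repo prefixes commit subjects with an emoji shortcode.
  useEmoji: boolean;
  // Longest subject line the repo accepts.
  maxSubjectLength: number;
}

// Hypothetical mapping from change type to a gitmoji shortcode.
const EMOJI: Record<ChangeType, string> = {
  feat: ":sparkles:",
  fix: ":bug:",
  docs: ":memo:",
  chore: ":wrench:",
};

// Build a subject like ":bug: fix(auth): handle expired tokens".
// Throws instead of silently truncating an over-long subject.
function formatCommitSubject(
  type: ChangeType,
  scope: string,
  summary: string,
  convention: CommitConvention
): string {
  const prefix = convention.useEmoji ? `${EMOJI[type]} ` : "";
  const subject = `${prefix}${type}(${scope}): ${summary.trim()}`;
  if (subject.length > convention.maxSubjectLength) {
    throw new Error(`Subject is ${subject.length} chars, limit is ${convention.maxSubjectLength}`);
  }
  return subject;
}
```

Failing loudly on an over-long subject is a deliberate choice in this sketch: an agent that quietly trims the message would hide exactly the kind of convention drift the pre-commit checks exist to catch.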

The Gilfoyle reviewer is the one I'm most attached to. Built on the gilfoyle-code-review.instructions.md templates floating around open source, it reviews pull requests with the deadpan, hyper-rational sarcasm of Silicon Valley's Bertram Gilfoyle: security flaws, performance regressions, architectural anti-patterns, all graded by severity. No participation trophies.
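"Graded by severity" is easy to picture as data. Here is a small sketch that assumes a four-level scale and a rule that anything rated high or above blocks the merge. The levels, the `Finding` shape, and the blocking rule are illustrative assumptions, not taken from the instruction templates.

```typescript
type Severity = "critical" | "high" | "medium" | "low";
type Category = "security" | "performance" | "architecture";

interface Finding {
  severity: Severity;
  category: Category;
  file: string;
  note: string;
}

// Lower rank means more severe.
const SEVERITY_RANK: Record<Severity, number> = {
  critical: 0,
  high: 1,
  medium: 2,
  low: 3,
};

// Return a sorted copy, most severe first; the input array is left untouched.
function sortBySeverity(findings: Finding[]): Finding[] {
  return [...findings].sort(
    (a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity]
  );
}

// Assumed policy: any finding rated high or critical blocks the merge.
function blocksMerge(findings: Finding[]): boolean {
  return findings.some(
    (f) => SEVERITY_RANK[f.severity] <= SEVERITY_RANK.high
  );
}
```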

The environment underneath

Linux (Ubuntu) is my primary laptop, with Windows 11 and Android in the mix for testing. For desktop packaging I default to Electron when I need solid UI rendering on Linux. Tauri has historically struggled there with certain window managers.

TickTick is still my day-to-day source of truth for tasks, mostly because of how well it syncs with Google Calendar. Linear runs the project management side. Capacities and Daylio handle the permanent memory layer: structured notes and personal logging. I'm also building a COWORK repository that takes Capacities exports and turns them into a portable Obsidian vault, partly for data sovereignty, partly to have a clean migration path whenever the Capacities subscription ends. Claude and OpenCode split that work between them.

For infrastructure, n8n covers visual workflow automation and webhooks, and Railway handles cloud hosting. The codebase runs out of a pnpm monorepo: TypeScript, React, Next.js, Tailwind, Supabase on the backend.

So, was it worth the rebuild?

Yeah. Dropping Copilot as a daily driver wasn't just about budget; it changed how I actually work day to day. A single IDE plugin nudges you toward a passive workflow, the kind where you spend your time auditing autocomplete suggestions instead of making decisions.

The piece that matters most isn't any single tool, though. It's plan.md. Forcing the Project Planner to commit every step to that file means the whole stack survives a provider running out of tokens, going down, or changing its pricing overnight without warning. One model burns out, another reads the file and picks up exactly where the first left off.

If you're still stuck auditing autocomplete suggestions one at a time, it might be time to build your own fleet.