How to Build a Multi-Agent AI Team with Claude Code
Set up Claude Code, install Skills, wire in a Telegram interface, and orchestrate multiple specialized agents that collaborate on real tasks — in about 90 minutes.
TL;DR — Key Takeaways
- Claude Code runs agents locally from your terminal; Skills files define what each agent knows how to do
- Multi-agent setups use an orchestrator agent that delegates to specialized sub-agents (coder, researcher, writer)
- Persistent memory is handled via a MEMORY.md file the agent reads and updates each session
- Telegram integration lets you control your agent team from a phone without opening a terminal
- Token consumption is significant; Claude Pro or a cheaper group plan is recommended before scaling up
1. Why multi-agent?
A single Claude agent is powerful, but it has limits. Context windows fill up. Long projects lose earlier decisions. Parallel tasks block on each other. The multi-agent pattern solves this by splitting work across specialized agents: one orchestrates, others execute narrowly defined roles.
In practice this looks like: an orchestrator agent that plans and delegates, a coder agent that writes and runs code, a researcher agent that searches the web, and a writer agent that produces documents. Each agent has a focused context and can work in parallel. Once you have the basics working, our Claude Code agent teams setup guide walks through the coordination layer in more depth.
Claude Code's subagent architecture — where the main agent spawns child tasks via the Task tool — makes this pattern native. You do not need to build a custom orchestration framework.
2. Install Claude Code
Claude Code requires Node.js 18+. On macOS:
    # Install Homebrew if needed
    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

    # Install Node.js
    brew install nodejs

    # Install Claude Code
    npm install -g @anthropic-ai/claude-code

    # Verify
    claude --version
On Linux (including WSL on Windows), replace the Homebrew steps with your package manager. Claude Code itself is cross-platform once Node.js is in place.
Authentication is handled via claude auth on first run, which opens a browser to complete OAuth with your Anthropic account. If you do not have a Claude subscription yet, you will need one — we cover the cost angle in section 9.
3. Create a Telegram Bot (optional interface)
Running agents from a terminal is fine for development, but inconvenient on mobile or when you want to hand tasks off mid-day. Wiring Claude Code to a Telegram Bot gives you a conversational interface accessible from anywhere.
To create a bot: open Telegram, search for @BotFather, send /newbot, and give it a display name and a username that ends in bot (for example, my_team_bot). BotFather returns a token; save it and treat it like a password.
The bridge between Claude Code and Telegram is handled during the onboarding wizard (next section), which accepts your bot token and sets up a persistent gateway process. After that, messages you send to the bot are routed to your local Claude Code instance, and responses come back through the same channel.
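Before handing the token to any gateway, it can help to sanity-check it. The sketch below is a minimal, offline example: it checks the token's general shape, builds the URL for the Telegram Bot API getMe method, and reads the username out of a getMe-style JSON reply. The helper names and the token regex are illustrative assumptions, not part of Claude Code, and no network call is made here.

```python
import json
import re
from typing import Optional

# Assumed token shape: numeric bot id, a colon, then a secret made of
# letters, digits, underscores, and hyphens. Illustrative check only.
TOKEN_SHAPE = re.compile(r"^\d+:[A-Za-z0-9_-]+$")


def looks_like_bot_token(token: str) -> bool:
    """Return True if the string has the general shape of a bot token."""
    return bool(TOKEN_SHAPE.match(token))


def get_me_url(token: str) -> str:
    """Build the URL for the Bot API getMe method (used to verify a token)."""
    return f"https://api.telegram.org/bot{token}/getMe"


def bot_username(get_me_body: str) -> Optional[str]:
    """Extract the bot username from a getMe JSON body, or None on failure."""
    data = json.loads(get_me_body)
    if not data.get("ok"):
        return None
    return data.get("result", {}).get("username")


if __name__ == "__main__":
    fake_token = "123456:ABC-def_ghi"  # placeholder, not a real token
    print(looks_like_bot_token(fake_token))
    print(get_me_url(fake_token))
    sample = '{"ok": true, "result": {"id": 123456, "is_bot": true, "username": "my_team_bot"}}'
    print(bot_username(sample))
```

To verify a real token, fetch the URL from get_me_url with any HTTP client and pass the response body to bot_username; a None result means the token was rejected.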
This pattern also works with Slack and Discord via community MCP servers, though Telegram is the easiest starting point — no workspace permissions or OAuth app setup required.
4. Run the onboarding wizard
Claude Code ships with an interactive setup wizard that configures your model provider, communication channel, and initial Skills in one pass:
claude onboard --install-daemon
The wizard asks you to choose a model (Claude, GPT-4o, Gemini), a communication channel (Telegram, CLI, or web UI), and whether to install default Skills. Choose QuickStart for the fastest path.
For the model step: Claude Sonnet 4.5 is the default and a strong choice — it balances speed and quality well for orchestration tasks. If you want maximum capability for the orchestrator role and are willing to pay more per token, Claude Opus 4.5 is the alternative.
When prompted for a communication channel, paste the Telegram bot token from the previous step. The wizard spins up a gateway process and confirms the connection. After setup, send any message to your bot — you should see Claude respond within a few seconds.
5. Add persistent memory
By default, Claude Code has no memory between sessions — every conversation starts fresh. For agent teams handling ongoing projects, this is a hard limitation. The standard fix is a MEMORY.md file.
Create it in your project directory:
    # MEMORY.md - Agent persistent state

    ## Current project
    - Goal: Build a SaaS backlink audit tool
    - Stack: Next.js 15, Supabase, Cloudflare Workers

    ## Decisions made
    - Using Ahrefs API (not Moz): rate limits are better
    - Authentication: NextAuth with Google OAuth

    ## Tasks completed
    - [x] Database schema designed
    - [x] Auth flow implemented

    ## In progress
    - [ ] Ahrefs API integration (assigned to coder-agent)

    ## Notes
    - Ahrefs API key stored in .env.local (never commit)
    - Staging URL: https://staging.myapp.com
Tell your orchestrator agent to read MEMORY.md at session start and append any decisions or progress at session end. You can encode this as a Skill (see next section) so it happens automatically.
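If you would rather have a script (or a Skill that shells out to one) do the session-end update, the idea can be sketched as follows. This is a minimal example under stated assumptions: the function name append_memory_entry and the dated-bullet format are hypothetical, and the code only relies on the "## Section" heading layout shown in the sample file above.

```python
from datetime import date
from pathlib import Path


def append_memory_entry(path: Path, section: str, entry: str) -> None:
    """Add a dated bullet under '## <section>', creating the section if absent."""
    lines = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    bullet = f"- {date.today().isoformat()}: {entry}"
    header = f"## {section}"

    if header not in lines:
        # Section missing: add it at the end of the file.
        if lines and lines[-1].strip():
            lines.append("")
        lines += [header, bullet]
    else:
        # Find where the section ends: the next '## ' header or end of file.
        start = lines.index(header) + 1
        end = start
        while end < len(lines) and not lines[end].startswith("## "):
            end += 1
        # Step back over trailing blank lines so the bullet joins the list.
        insert_at = end
        while insert_at > start and not lines[insert_at - 1].strip():
            insert_at -= 1
        lines.insert(insert_at, bullet)

    path.write_text("\n".join(lines) + "\n", encoding="utf-8")


if __name__ == "__main__":
    append_memory_entry(Path("MEMORY.md"), "Decisions made", "Example decision")
```

Appending rather than rewriting keeps earlier decisions intact, which is the property the orchestrator depends on when it rereads the file in the next session.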
For more structured memory, the mem0 MCP server provides semantic search over past interactions. Browse the available memory Skills on our SkillsMap directory — filter by the "Memory" category.
6. Install Skills
Skills are markdown files stored in ~/.claude/skills/ that give agents specific instructions. Think of them as reusable system prompts for recurring task types.
Installing a Skill from GitHub:
    # Install the sequential-thinking Skill
    claude skills install https://github.com/anthropics/claude-code-skills

    # Or copy manually
    curl -o ~/.claude/skills/sequential-thinking.md https://raw.githubusercontent.com/.../sequential-thinking.md
For a broader survey of what's available, our roundup of the best Claude Code skills covers 349, ranked by community adoption. For a multi-agent setup, a useful starter set includes:
| Skill | Role | What it does |
|---|---|---|
| sequential-thinking | Orchestrator | Forces structured reasoning before taking action |
| web-search | Researcher | Brave/Tavily search with result summarization |
| code-review | Coder | Structured review across security, performance, readability |
| memory-manager | All agents | Read/write MEMORY.md at session boundaries |
| tdd-guide | Coder | Red-green-refactor loop enforcement |
You can browse 349+ ranked Skills across 12 categories on our SkillsMap directory, sorted by GitHub stars so the most-adopted ones surface first.
7. Spawn sub-agents
Once your orchestrator is running, you trigger sub-agents using the Task tool natively built into Claude Code. From the orchestrator's perspective:
    # In CLAUDE.md (orchestrator instructions):
    When working on a feature, always:
    1. Use /sequential-thinking to plan
    2. Spawn a sub-agent via Task tool for implementation
    3. Spawn a second sub-agent for code review
    4. Only merge when both sub-agents complete

    # The Task tool usage in practice:
    # "Implement the Ahrefs API integration described in MEMORY.md.
    # Use the tdd-guide Skill. Write tests first."
    # Claude spawns a child agent, runs it, returns results.
Sub-agents inherit the parent's working directory but not its conversation history. They receive a specific prompt, execute, return output, and terminate. This keeps each agent's context clean and allows genuinely parallel execution when tasks are independent.
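The fan-out and fan-in shape described above can be sketched in plain Python. This is an analogy, not Claude Code's API: run_subagent is a hypothetical stand-in for a real sub-agent call, and SubTask is an illustrative type. What it shows is that each child receives only its own prompt and that independent tasks can run concurrently while results come back in a predictable order.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SubTask:
    role: str
    prompt: str  # the only context the child receives


def run_subagent(task: SubTask) -> str:
    """Hypothetical stand-in for a real sub-agent call.

    It sees only task.prompt, mirroring how a child gets a specific prompt
    rather than the parent's conversation history.
    """
    return f"[{task.role}] done: {task.prompt}"


def fan_out(tasks: List[SubTask], max_workers: int = 4) -> List[str]:
    """Run independent tasks concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_subagent, tasks))


if __name__ == "__main__":
    results = fan_out([
        SubTask("coder", "Implement the Ahrefs API integration"),
        SubTask("reviewer", "Review the auth flow for security issues"),
    ])
    for line in results:
        print(line)
```

Because nothing is shared implicitly, any state a child needs (file paths, decisions from MEMORY.md) has to be written into its prompt.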
A practical pattern for the orchestrator's CLAUDE.md:
    # CLAUDE.md - Orchestrator

    ## Team structure
    - Orchestrator (me): planning, delegation, memory management
    - Coder sub-agent: implementation, tests (uses: tdd-guide, code-review)
    - Researcher sub-agent: web search, summarization (uses: web-search)
    - Writer sub-agent: documentation, blog posts (uses: content-writer)

    ## Rules
    - Read MEMORY.md at start of every session
    - Update MEMORY.md before ending every session
    - Never implement directly; always delegate to Coder sub-agent
    - Parallelise independent tasks with simultaneous Task calls
8. Safety guardrails
Multi-agent setups amplify both capability and risk. A misconfigured agent with filesystem access can delete files or expose credentials. A few rules that save pain:
1. Never store API keys in MEMORY.md or CLAUDE.md. Reference environment variables only. CLAUDE.md is checked into git by default; treat it as public.
2. Use allowlists for filesystem access. Claude Code's permission model lets you restrict which directories agents can modify. Set the working directory to your project root, not ~.
3. Require explicit confirmation for destructive operations. Add a rule to your orchestrator: "Before running any rm, DROP TABLE, or force-push, pause and confirm with the user."
4. Cap token budgets per sub-agent. Without limits, a looping agent can exhaust your monthly Pro quota in an afternoon. Use --max-tokens per Task invocation.
5. Log all agent actions. Claude Code writes a session log by default. Point a monitoring Skill at it to alert you if an agent takes unexpected actions.
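The confirmation rule for destructive operations can also be backed by a small check outside the prompt. The sketch below is one possible approach, not a Claude Code feature: the patterns and the function name needs_confirmation are illustrative assumptions, and the list covers only the three cases named above (rm, DROP TABLE, force-push).

```python
import re

# Illustrative patterns only; extend them for your own tooling.
DESTRUCTIVE_PATTERNS = [
    re.compile(r"\brm\s+", re.IGNORECASE),               # file deletion
    re.compile(r"\bdrop\s+table\b", re.IGNORECASE),      # SQL table removal
    re.compile(r"\bgit\s+push\b.*(--force\b|\s-f\b)"),   # force-push
]


def needs_confirmation(command: str) -> bool:
    """Return True if the command matches any destructive pattern."""
    return any(p.search(command) for p in DESTRUCTIVE_PATTERNS)


if __name__ == "__main__":
    for cmd in ["rm -rf build/", "git push origin main", "git push -f"]:
        print(cmd, "->", needs_confirmation(cmd))
```

A pattern list like this is deliberately conservative: it may flag harmless commands, which is usually a better failure mode than letting a destructive one through unreviewed.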
9. Managing token costs
This is where multi-agent setups can get expensive fast. Every sub-agent call consumes tokens — the orchestrator's context, the sub-agent prompt, and the sub-agent response all count. A moderately complex coding task that spawns 3 sub-agents might use 50K–150K tokens per session. For context on how Claude Code stacks up against competing tools on this front, see our AI coding tools overview.
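As a rough way to reason about that accounting, the sketch below adds up the three components named above. The function name and all the numbers in the example are hypothetical placeholders chosen to land inside the 50K to 150K range; they are not measurements.

```python
from typing import List, Tuple


def estimate_session_tokens(
    orchestrator_tokens: int,
    subagent_calls: List[Tuple[int, int]],
) -> int:
    """Sum orchestrator context plus (prompt, response) tokens per sub-agent call."""
    return orchestrator_tokens + sum(
        prompt + response for prompt, response in subagent_calls
    )


if __name__ == "__main__":
    # Hypothetical session: 30K of orchestrator context and three sub-agents,
    # each receiving an 8K prompt and returning a 12K response.
    total = estimate_session_tokens(30_000, [(8_000, 12_000)] * 3)
    print(total)  # 30_000 + 3 * 20_000 = 90_000
```

The estimate ignores the tokens spent when sub-agent results are folded back into the orchestrator's context, so treat it as a lower bound.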
Claude Pro at $20/month includes roughly 5× the rate limits of the free tier, which is enough for light multi-agent experimentation. Claude Max at $100/month is where heavy orchestration becomes practical: it raises limits by roughly another 5× over Pro and includes early access to longer context features.
If you want to experiment with Claude Pro without the full monthly commitment, GamsGo offers group subscription plans that bring Claude Pro, ChatGPT Plus, and other AI tool costs down by 30–70%. It is worth checking before committing to a solo plan — particularly useful when you are testing multi-agent setups and not yet sure what your actual monthly token usage will be.
Practical cost reduction tips:
- Use Claude Haiku 4.5 for researcher sub-agents (web search, summarization) — it is ~10× cheaper and fast enough for retrieval tasks
- Use Claude Sonnet 4.5 for the orchestrator and writer roles
- Reserve Opus 4.5 for architecture decisions and one-off complex reasoning tasks
- Set explicit context limits in each sub-agent's prompt to avoid runaway consumption
- Cache repeated context (project stack, coding conventions) in CLAUDE.md rather than re-explaining in each prompt
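The routing tips above amount to a lookup from role to model tier. A minimal sketch, assuming hypothetical role names and using short tier labels ("haiku", "sonnet", "opus") rather than real model identifiers; the default for unlisted roles is also an assumption:

```python
# Role-to-tier routing table based on the tips above. The keys and tier
# labels are illustrative, not actual model IDs.
ROLE_TO_TIER = {
    "researcher": "haiku",     # retrieval and summarization
    "orchestrator": "sonnet",  # planning and delegation
    "writer": "sonnet",        # documents and posts
    "architect": "opus",       # one-off complex reasoning
}

DEFAULT_TIER = "sonnet"  # assumed fallback for roles not listed above


def pick_tier(role: str) -> str:
    """Return the tier label for a role, falling back to the default."""
    return ROLE_TO_TIER.get(role.strip().lower(), DEFAULT_TIER)


if __name__ == "__main__":
    for role in ["Researcher", "writer", "coder"]:
        print(role, "->", pick_tier(role))
```

Keeping the mapping in one place makes it easier to spot a sub-agent that is running on a different tier than you intended.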
How We Tested
This tutorial is based on 23 multi-agent Claude Code workflows we ran across 6 real projects over 5 weeks.
- Next.js 15 portfolio site refactor: 4 subagents ran in parallel for component migration, route cleanup, copy review, and regression checks.
- Python FastAPI backlink-audit toolchain: a 3-subagent loop split scraper fixes, URL deduping, and scoring-rule validation.
- Multi-site SEO tooling repo: parallel subagents generated test suites for sitemap parsing, schema extraction, and keyword clustering.
- Markdown blog content pipeline: writer, reviewer, and fact-checker subagents handled draft expansion, expert critique, and citation cleanup.
Single-agent baseline averaged 18-25K tokens per task. The 3-subagent variants averaged 55-80K tokens (about 3x the cost) but completed in roughly 40% of wall-clock time for parallel-decomposable work.
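The trade-off in those figures can be worked through with range midpoints. This is illustrative arithmetic on the numbers reported above, not an additional measurement; using midpoints is a simplifying assumption.

```python
single_agent_range = (18_000, 25_000)  # tokens per task, single agent
three_agent_range = (55_000, 80_000)   # tokens per task, 3 sub-agents
time_fraction = 0.40                   # multi-agent wall-clock vs single


def midpoint(bounds):
    """Midpoint of a (low, high) range."""
    low, high = bounds
    return (low + high) / 2


cost_multiplier = midpoint(three_agent_range) / midpoint(single_agent_range)
speedup = 1 / time_fraction
cost_per_unit_speedup = cost_multiplier / speedup

print(round(cost_multiplier, 2))        # 3.14
print(round(speedup, 2))                # 2.5
print(round(cost_per_unit_speedup, 2))  # 1.26
```

Read this as: for parallel-decomposable work, the 3-subagent runs spent about 3.1× the tokens to finish about 2.5× faster, or roughly 1.26× the tokens per unit of speedup.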
The main failure modes were silent subagent output truncation on long results, runaway sub-sub-agent recursion (one orchestrator spawned 11 subagents in a debugging loop before we stopped it), context drift when prompts assumed shared state, and model routing surprises (one subagent quietly downgraded to Haiku and produced thinner output).
We did NOT measure enterprise team handoff workflows, GitHub Actions integration, or workloads above 200K tokens per session; those need separate methodology.
All testing on Claude Pro and Max subscriptions paid out of pocket. No compensation from Anthropic.
FAQ
What is a Claude Code multi-agent setup?
A multi-agent setup runs several Claude instances simultaneously, each with a specific role — one orchestrates tasks, others specialize in coding, search, or memory. They communicate via the Task tool and hand off work between them, letting you tackle complex projects that would overwhelm a single agent's context window.
Do I need Claude Pro for multi-agent workflows?
Yes, at minimum. The free tier's rate limits are too low for parallel agent calls. Claude Pro ($20/month) is sufficient for experimentation; Claude Max ($100/month) is better for production-level orchestration. See section 9 for ways to reduce that cost.
How do I give agents memory between sessions?
The simplest pattern is a MEMORY.md file that the orchestrator reads at session start and appends to at session end. For semantic retrieval over longer histories, the mem0 MCP server adds vector search on top of plain-text notes.
What are Claude Code Skills?
Skills are reusable markdown instruction files stored in ~/.claude/skills/. They act as focused system prompts for specific task types — code review, writing, research, memory management. You can install community Skills from GitHub or write custom ones. Browse ranked Skills on our SkillsMap directory.
Can I use GPT-4o instead of Claude for the agents?
Claude Code natively supports multiple model providers. You can point sub-agents at GPT-4o or Gemini Pro via the model configuration, which is useful for cost optimization — cheaper models for retrieval tasks, stronger ones for synthesis and planning. The orchestration layer stays the same regardless of which underlying model each agent uses.
How do I spawn a subagent in Claude Code?
Use Claude Code's Agent tool (the same sub-agent tool this guide refers to elsewhere as the Task tool) from the parent session. The key field is subagent_type, usually general-purpose for open-ended work or a specialized type such as code-reviewer when your install exposes one. The child agent gets its own clean context, runs the prompt you pass, returns TaskOutput, then exits. For long tasks, dispatch it in background mode so the parent can keep planning or launch 2 to 4 other independent subagents while it runs.
How much does multi-agent Claude Code cost compared with a single agent?
Expect a token multiplier. Each subagent consumes the full prompt you send plus its response, and parallel dispatch creates a cost burst at the moment the Agent calls launch. Prompt-cache reuse can soften repeated project context, but it does not make subagents free. A single-agent coding pass that used 25K tokens in our tests often became 95K tokens with 4 parallel subagents because each received files, constraints, and output instructions separately.
When should I use a subagent instead of MCP?
Use the distinction this way: a subagent is isolated reasoning, MCP is a stateful external service or tool bridge. MCP wins when the job depends on a specific external API, for example Linear issue updates, Sentry error search, Postgres queries, or a browser session. A subagent wins when the job is parallel decomposable reasoning work, such as reviewing 3 files independently, comparing 4 implementation options, or drafting separate migration plans.
What are the best subagent patterns for code review?
Start with a code-reviewer subagent_type when available, then use a second verification subagent after fixes land. For risky changes, run review subagents in isolated git worktrees, especially if your CLAUDE.md documents the superpowers:using-git-worktrees pattern. The loop is simple: dispatch implementation, dispatch review against the diff, apply fixes in the parent, then dispatch verification. Keep each review prompt narrow, for example security only, migration correctness only, or test coverage only.
How do I debug a multi-agent workflow?
Inspect TaskOutput for each subagent first, not just the parent summary. TaskList state usually tells you which subagent failed, timed out, or returned thin output. For messy failures, dispatch a dedicated debug subagent with the failed prompt, the returned output, and the expected format, then ask it to retrace the likely reasoning gap. One common gotcha: a subagent can silently truncate long results, so missing final sections may be a token ceiling problem rather than bad reasoning.
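One way to catch silent truncation is to ask every subagent to follow a fixed report format and then check for it. The sketch below assumes a convention you would define yourself: required headings and an end marker (END_OF_REPORT here) that the prompt tells the subagent to emit. Neither is a Claude Code feature, and the function names are hypothetical.

```python
from typing import List


def missing_sections(output: str, required_headings: List[str]) -> List[str]:
    """Return the required headings that never appear as a line in the output."""
    present = {line.strip().lower() for line in output.splitlines()}
    return [h for h in required_headings if h.strip().lower() not in present]


def looks_truncated(
    output: str,
    required_headings: List[str],
    end_marker: str = "END_OF_REPORT",
) -> bool:
    """Flag output that lacks a required heading or the closing marker."""
    if missing_sections(output, required_headings):
        return True
    return not output.rstrip().endswith(end_marker)


if __name__ == "__main__":
    required = ["## Findings", "## Next steps"]
    cut_off = "## Findings\nTwo issues found.\n## Next st"
    print(missing_sections(cut_off, required))
    print(looks_truncated(cut_off, required))
```

When looks_truncated returns True, rerunning the subagent with a narrower scope or a shorter requested output is usually more productive than questioning its reasoning.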
Can subagents share memory or context?
No, not by default. Each Claude Code subagent starts with clean context plus the prompt, files, and tool permissions you pass into the Agent call. Practical workarounds are straightforward: paste prior state into the prompt body, ask each subagent to read a shared file such as MEMORY.md or session.json, or have the parent agent relay compact summaries between tasks. The parent should still resolve conflicts because shared files can drift when several agents update them.
How do I prevent runaway multi-agent loops?
Set max_budget on each Task call when your Claude Code version supports it, then watch TaskList for ballooning tool counts. Put explicit termination criteria in every subagent prompt, such as stop after 3 failing test runs or return findings after 20 files. Add a manual stop button or command in the orchestrator workflow. The failure to watch closely is sub-sub-agent recursion, where a child agent starts spawning more agents without a hard ceiling.
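If you wrap agent dispatch in your own script, a hard ceiling on depth and total spawns can be enforced in code as well as in prompts. This is a minimal sketch of that idea; SpawnBudget and SpawnLimitExceeded are hypothetical names, and the default limits are arbitrary assumptions rather than recommended values.

```python
class SpawnLimitExceeded(RuntimeError):
    """Raised when a spawn would break the depth or count ceiling."""


class SpawnBudget:
    """Tracks sub-agent spawns against a maximum depth and a total count."""

    def __init__(self, max_depth: int = 2, max_total: int = 8) -> None:
        self.max_depth = max_depth
        self.max_total = max_total
        self.total = 0

    def register(self, depth: int) -> None:
        """Record one spawn at the given depth (1 = direct child of the parent)."""
        if depth > self.max_depth:
            raise SpawnLimitExceeded(f"depth {depth} exceeds {self.max_depth}")
        if self.total >= self.max_total:
            raise SpawnLimitExceeded(f"spawn limit {self.max_total} reached")
        self.total += 1


if __name__ == "__main__":
    budget = SpawnBudget(max_depth=1, max_total=2)
    budget.register(depth=1)
    budget.register(depth=1)
    try:
        budget.register(depth=1)
    except SpawnLimitExceeded as err:
        print("blocked:", err)
```

Setting max_depth to 1 rules out sub-sub-agents entirely, which is the simplest guard against the recursion failure described above.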
What is the production readiness of Claude Code multi-agent in 2026?
In 2026, Claude Code multi-agent works well for solo developer workflows where one person supervises cost, prompts, and final integration. It is less mature for team handoff. Current gaps include no built-in cross-session subagent memory, limited first-class observability, and uneven output determinism. Before using it in production, test cost ceilings, failure recovery, and repeatability on 20 to 30 representative tasks. If you need stricter orchestration now, compare LangGraph or CrewAI.