The Three Topologies
Multi-agent systems are usually described by what they can do. You learn more by asking how they break. I built the same task (a structured research pipeline) in three different architectures and deliberately ran each one into its breaking point.
Topology 1: The Monolith
One agent, all tools, sequential execution. The agent has access to a web search tool, a structured extraction tool, and a formatting tool. It decides when to call each one through its own reasoning loop.
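The shape is easy to see in code. Here is a minimal sketch of the monolith, with stub tools standing in for real search/extraction/formatting calls; in a real system the agent's own reasoning loop, not hardcoded sequencing, decides which tool to call next.

```python
# Monolith sketch: one agent, one shared context, all tools in one loop.
# Tool implementations are illustrative stubs, not a real framework's API.

TOOLS = {
    "search": lambda query: f"results for {query!r}",
    "extract": lambda text: {"claims": [text]},
    "format": lambda data: f"REPORT: {data['claims'][0]}",
}

def run_monolith(task: str) -> str:
    """Sequential pipeline: the single agent threads all state itself."""
    context = []                    # one shared context for every step
    raw = TOOLS["search"](task)     # step 1: gather sources
    context.append(raw)
    claims = TOOLS["extract"](raw)  # step 2: structured extraction
    context.append(claims)
    return TOOLS["format"](claims)  # step 3: final formatting

print(run_monolith("agent topologies"))
```

Everything lives in one trace, which is exactly why the debugging story is good and the context window story eventually is not.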
Where it works: Tasks with roughly six or fewer sequential steps where the same "personality" (system prompt) fits every step. Debugging is transparent. Every step shows up in one trace you can read top to bottom.
Where it breaks: Context window saturation after ~6 complex steps. Prompt interference when early instructions conflict with late-stage task requirements. A "write a thorough research report" instruction and a "be concise and punchy" instruction will fight each other once the context fills up.
Topology 2: The Handoff Chain
Specialized agents hand off to the next in sequence. Agent A (Researcher) collects raw sources and passes a structured payload to Agent B (Analyst), who extracts claims and hands to Agent C (Writer), who formats the final output.
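The Researcher → Analyst → Writer chain can be sketched with explicit payload types at each boundary. The agent functions and field names below are illustrative stubs, not any framework's API; the point is that each handoff has a declared shape.

```python
from dataclasses import dataclass

# Handoff chain sketch: each boundary has an explicit payload contract.

@dataclass
class ResearchPayload:   # Agent A (Researcher) -> Agent B (Analyst)
    sources: list[str]

@dataclass
class AnalysisPayload:   # Agent B (Analyst) -> Agent C (Writer)
    claims: list[str]

def researcher(topic: str) -> ResearchPayload:
    return ResearchPayload(sources=[f"source about {topic}"])

def analyst(p: ResearchPayload) -> AnalysisPayload:
    return AnalysisPayload(claims=[f"claim from {s}" for s in p.sources])

def writer(p: AnalysisPayload) -> str:
    return "\n".join(f"- {c}" for c in p.claims)

report = writer(analyst(researcher("agent topologies")))
print(report)
```

Making the payloads real types (rather than free-form text passed between prompts) is what lets each stage run a different system prompt, or a different model, without the stages negotiating anything at runtime.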
Where it works: Tasks with genuinely incompatible system prompts across stages. Long pipelines where a monolith's context would overflow. When different stages want different models (a cheap model for extraction, an expensive one for synthesis).
Where it breaks: At the handoff boundary. Every transition is a trust boundary. Agent B receives Agent A's output and has to interpret it correctly without being able to ask a clarifying question. Output format drift across runs produces silent downstream failures that look like Agent B's problem but are actually Agent A's. This is the hardest failure mode to diagnose in production.
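One mitigation for format drift is to validate at the trust boundary and fail loudly, instead of letting the receiving agent improvise over a malformed payload. A minimal sketch, assuming a simple field-name-to-type contract (a stand-in for a real schema library):

```python
def validate_handoff(payload: dict, required: dict[str, type]) -> dict:
    """Check Agent A's output against a declared contract before Agent B
    interprets it. A loud error here is cheaper than a silent downstream
    failure that looks like Agent B's problem."""
    for field, typ in required.items():
        if field not in payload:
            raise ValueError(f"handoff missing field: {field!r}")
        if not isinstance(payload[field], typ):
            raise TypeError(
                f"handoff field {field!r}: expected {typ.__name__}, "
                f"got {type(payload[field]).__name__}"
            )
    return payload

# Agent B runs this on entry, before any interpretation happens.
ok = validate_handoff(
    {"sources": ["url-1"], "query": "agent topologies"},
    {"sources": list, "query": str},
)
```

This turns "Agent A's drift surfaces as Agent B's hallucination three stages later" into "Agent A's drift surfaces as a stack trace at the boundary where it happened."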
Topology 3: The Swarm
A coordinator agent decomposes the task and dispatches sub-tasks to worker agents running in parallel. Workers complete independently and return results to the coordinator for aggregation.
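The fan-out/fan-in shape can be sketched with a thread pool, with `run_worker` standing in for a real agent call (e.g. an LLM API request) and the topics assumed genuinely independent:

```python
from concurrent.futures import ThreadPoolExecutor

# Swarm sketch: coordinator fans sub-tasks out to parallel workers,
# then aggregates. `run_worker` is an illustrative stub for an agent call.

def run_worker(topic: str) -> str:
    return f"findings on {topic}"

def coordinate(topics: list[str]) -> str:
    with ThreadPoolExecutor(max_workers=len(topics)) as pool:
        results = list(pool.map(run_worker, topics))  # parallel fan-out
    return "\n".join(results)  # aggregation: where format conflicts bite

print(coordinate(["topic A", "topic B", "topic C"]))
```

Note that `pool.map` preserves input order, which papers over one aggregation problem (ordering) but not the harder one (workers returning structurally different outputs).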
Where it works: Tasks with genuinely parallel sub-problems, like researching 5 independent topics in one go. Latency-sensitive pipelines where wall-clock time matters more than total token cost.
Where it breaks: Hidden sequential dependencies (more common than people expect). Output format conflicts at aggregation. The coordinator agent turning into a reasoning bottleneck as task complexity grows.
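Hidden sequential dependencies are worth checking for before you parallelize at all. A sketch, assuming you can write down which sub-task needs which other sub-task's output (real pipelines rarely declare this explicitly, which is exactly why the dependencies stay hidden):

```python
def parallel_batches(deps: dict[str, set[str]]) -> list[set[str]]:
    """Group sub-tasks into waves that can truly run in parallel.
    `deps` maps each task to the tasks whose output it needs.
    Raises if a cycle makes any schedule impossible."""
    done: set[str] = set()
    batches: list[set[str]] = []
    remaining = dict(deps)
    while remaining:
        ready = {t for t, d in remaining.items() if d <= done}
        if not ready:
            raise ValueError("cyclic dependency: no schedule exists")
        batches.append(ready)
        done |= ready
        for t in ready:
            del remaining[t]
    return batches

# Three "independent" tasks where C secretly needs A's output:
waves = parallel_batches({"A": set(), "B": set(), "C": {"A"}})
print(waves)  # two waves, not one: the parallelism was partly fake
```

If the result is one wave, the swarm is honest. If it is several, you have a handoff chain wearing a swarm costume, and the wall-clock win you were promised shrinks accordingly.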
Which Topology to Choose
The answer comes down to two variables: task complexity (how many distinct steps with conflicting requirements) and parallelism opportunity (how much of the work is genuinely independent).
| Situation | Recommended topology |
|---|---|
| Under 6 steps, same reasoning style throughout | Monolith |
| Steps require incompatible system prompts | Handoff |
| Tasks are genuinely independent and latency matters | Swarm |
| Uncertain — start here | Monolith, add complexity when you hit a wall |
The Question Nobody Asks
Here is the question that almost never comes up in topology discussions: does the task itself actually need more than one agent? Most teams skip past that question and go straight to "which pattern should we use?" The honest answer starts one step earlier.
What I see in practice is teams picking a topology based on what their chosen orchestration framework makes easy. CrewAI makes a handoff chain feel natural because the Crew/Agent/Task abstraction is right there. LangGraph makes a coordinator-plus-workers pattern feel natural because the graph primitives cleanly support it. Swarm makes handoffs cheap because handoffs are the only primitive it has. So the framework quietly shapes the architecture. The architecture shapes the failure modes. The failure modes shape your on-call rotation. Six months later you're debugging a swarm that should have been a monolith, and the original reason was "CrewAI felt easy on Tuesday."
The trap to avoid: picking the topology your tooling makes convenient rather than the one your task actually requires. If your task is three sequential steps with the same system prompt, that's a monolith. It doesn't matter how elegant the handoff pattern looks in a CrewAI tutorial. Framework ergonomics are a poor proxy for architectural fit. Write the task decomposition on paper before you open a framework. Then pick.
One tax worth pricing before you commit to a swarm or handoff pattern: token cost. Every handoff adds context window overhead because the receiving agent needs enough of the prior state to do its job. A 5-agent swarm on a non-trivial task can easily 3x your token spend versus a monolith running the same work with a single context. For production systems that hit LLM APIs thousands of times per day, "parallel chaos" is also "parallel invoice."
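The handoff tax is easy to price on the back of an envelope. The numbers below are made up but plausible: an illustrative per-1K-token price, a guess at how much prior state each boundary re-sends. Swap in your own figures.

```python
# Back-of-envelope handoff tax. All numbers are illustrative assumptions,
# not any provider's real rates or any measured pipeline.

PRICE_PER_1K = 0.01        # illustrative input-token price (USD)
TASK_TOKENS = 4_000        # core work tokens a monolith would spend
HANDOFF_OVERHEAD = 2_500   # prior state re-sent at each boundary
N_AGENTS = 5

monolith_cost = TASK_TOKENS / 1000 * PRICE_PER_1K
swarm_cost = (TASK_TOKENS + (N_AGENTS - 1) * HANDOFF_OVERHEAD) / 1000 * PRICE_PER_1K

print(f"monolith: ${monolith_cost:.3f}  "
      f"swarm: ${swarm_cost:.3f}  ({swarm_cost / monolith_cost:.1f}x)")
```

With these assumed numbers the 5-agent version lands at 3.5x the monolith's spend, squarely in the "easily 3x" range above, and that is before any retries at failed handoff boundaries.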
FAQ
What is the monolith topology in multi-agent systems?
A monolith topology uses a single orchestrating agent that runs all tasks sequentially or manages all tools directly. No sub-agents, no handoffs. The word "monolith" carries baggage in software engineering, but for agents it's usually the right call. Lowest failure surface, highest debuggability. For most use cases that claim to need multi-agent systems, a well-designed monolith with parallel tool calls is the right answer.
What is the handoff topology and when should I use it?
Handoff topology passes control and context from one specialized agent to the next. Agent A does task A, packages its output, and hands off to Agent B. This works well when tasks need distinctly different specializations that conflict (a research agent and an editing agent with different prompting strategies). The failure mode: handoffs accumulate context loss at each boundary. If Agent A's output format doesn't match Agent B's expectations exactly, you get a failure that looks like Agent B's problem but is actually Agent A's output quality. Use handoff when specialization genuinely requires incompatible prompting, not just different tasks.
What is the swarm topology?
Swarm topology runs multiple agents in parallel on independent sub-tasks, then aggregates the results. The appeal is throughput. A 10-task pipeline that takes 10 minutes sequentially can take 1 minute with 10 parallel agents. The catch: task independence is harder to guarantee than it looks. If Agent 3 needs Agent 1's output to do its job, your swarm has a hidden sequential dependency and the parallelism is fake. Swarms also need more careful state management. You need somewhere to collect and merge partial outputs.
Which topology should I start with?
Start with the monolith. Add complexity only when you hit a concrete wall: the monolith prompt is too long for reliable instruction following, tasks genuinely need incompatible system prompts, or you have proven latency requirements that only parallel execution can meet. Most teams reach for multi-agent complexity long before they've exhausted what a well-designed single agent can do.
What are the most common production failure modes for each topology?
Monolith: context window overflow when tasks accumulate history; prompt interference between early and late instructions. Handoff: context loss at boundaries; the receiving agent ignores a malformed handoff and hallucinates instead of erroring. Swarm: phantom parallelism (dependencies force sequential execution anyway); output aggregation failures when formats don't align; the coordinator agent turning into a bottleneck. Full failure catalog publishes with the Season 2 podcast.