CTAIO Labs
Season 2 podcast · CTAIO Labs · S02E02

Monolith, Handoff, or Swarm? Three Agentic Topologies in Production

Architecture patterns for multi-agent systems. When each topology works, where each breaks under load, and the failure modes from building all three.

Season 2 · In Progress
All three topologies have been built and run into their failure modes. The full production failure catalog ships with the Season 2 podcast audio. Subscribe below if you want the breakdown.

Key Takeaways

  • You pick a topology the same way you pick a pager rotation: based on what you can debug at 3am. — The right topology depends on your failure tolerance and your debugging budget, not the number of tasks on paper. A monolith that breaks in a single visible trace beats a swarm that fails silently across three workers.
  • Handoff topologies break at the handoff. Almost every time. — The boundary between agents is the highest-risk point in the whole system. If the receiving agent doesn't validate what it was passed, you get downstream failure with upstream blame. The transition contract matters more than the agents themselves.
  • A swarm without a coordinator is just parallel chaos. — Pure swarm topologies devolve quickly under ambiguous tasks. The parallelism benefit only shows up when agents can genuinely work independently. That's rarer than the pattern implies — most "parallel" tasks have hidden sequential dependencies.

The Three Topologies

Multi-agent systems are usually described by what they can do. You learn more by asking how they break. I built the same task (a structured research pipeline) in three different architectures and deliberately ran each one into its breaking point.

Topology 1: The Monolith

One agent, all tools, sequential execution. The agent has access to a web search tool, a structured extraction tool, and a formatting tool. It decides when to call each one through its own reasoning loop.

Orchestrator Agent
→ search_web()
→ extract_claims()
→ format_report()
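The monolith loop above can be sketched in a few lines. This is a minimal illustration, not a real implementation: `search_web`, `extract_claims`, and `format_report` are hypothetical stand-ins for actual tool calls, and the agent's reasoning loop is collapsed into plain sequential code.

```python
# Minimal sketch of the monolith topology: one agent owns every tool and
# runs them in a single sequential trace. All three tool functions are
# hypothetical placeholders for real tool-call implementations.

def search_web(query: str) -> list[str]:
    return [f"source for {query}"]                      # placeholder result

def extract_claims(sources: list[str]) -> list[str]:
    return [f"claim from {s}" for s in sources]

def format_report(claims: list[str]) -> str:
    return "\n".join(f"- {c}" for c in claims)

def monolith_agent(query: str) -> str:
    """Single trace, top to bottom: every step is visible in one place."""
    sources = search_web(query)
    claims = extract_claims(sources)
    return format_report(claims)
```

The payoff is exactly what the sketch shows: one call stack, one trace, one place to look when it breaks.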

Where it works: Tasks with fewer than ~8 sequential steps where the same "personality" (system prompt) fits every step. Debugging is transparent. Every step shows up in one trace you can read top to bottom.

Where it breaks: Context window saturation after ~6 complex steps. Prompt interference when early instructions conflict with late-stage task requirements. A "write a thorough research report" instruction and a "be concise and punchy" instruction will fight each other once the context fills up.

Topology 2: The Handoff Chain

Specialized agents hand off to the next in sequence. Agent A (Researcher) collects raw sources and passes a structured payload to Agent B (Analyst), who extracts claims and hands to Agent C (Writer), who formats the final output.

Researcher Agent
→ handoff(sources)
Analyst Agent
→ handoff(claims)
Writer Agent

Where it works: Tasks with genuinely incompatible system prompts across stages. Long pipelines where a monolith's context would overflow. When different stages want different models (a cheap model for extraction, an expensive one for synthesis).

Where it breaks: At the handoff boundary. Every transition is a trust boundary. Agent B receives Agent A's output and has to interpret it correctly without being able to ask a clarifying question. Output format drift across runs produces silent downstream failures that look like Agent B's problem but are actually Agent A's. This is the hardest failure mode to diagnose in production.
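One way to harden that boundary is to make the handoff payload an explicit, validated contract. The sketch below assumes a fixed schema between Researcher and Analyst; `SourcesPayload` and `analyst_agent` are illustrative names, not from any framework. The point is that a malformed handoff fails loudly at the boundary instead of producing downstream failure with upstream blame.

```python
# Sketch of a guarded handoff boundary: the receiving agent validates the
# payload before doing any work. Schema and names are hypothetical.
from dataclasses import dataclass

@dataclass
class SourcesPayload:
    query: str
    sources: list[str]

    def validate(self) -> None:
        if not self.sources:
            raise ValueError("handoff rejected: empty source list from Researcher")
        if not all(isinstance(s, str) and s.strip() for s in self.sources):
            raise ValueError("handoff rejected: malformed source entry")

def analyst_agent(payload: SourcesPayload) -> list[str]:
    payload.validate()                      # trust-boundary check, before any work
    return [f"claim extracted from {s}" for s in payload.sources]
```

A rejected payload now names the upstream agent in its error message, which is most of the diagnosis work done for you.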

Topology 3: The Swarm

A coordinator agent decomposes the task and dispatches sub-tasks to worker agents running in parallel. Workers complete independently and return results to the coordinator for aggregation.

Coordinator Agent
Worker A
Worker B
Worker C
Aggregator

Where it works: Tasks with genuinely parallel sub-problems, like researching 5 independent topics in one go. Latency-sensitive pipelines where wall-clock time matters more than total token cost.

Where it breaks: Hidden sequential dependencies (more common than people expect). Output format conflicts at aggregation. The coordinator agent turning into a reasoning bottleneck as task complexity grows.
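The dispatch-and-aggregate shape can be sketched with stdlib threads. This is a toy, not a production pattern: `worker` stands in for a real sub-agent call, and a real system would also need timeouts, retries, and schema checks at aggregation.

```python
# Sketch of the swarm topology: a coordinator fans sub-tasks out to parallel
# workers, then an aggregation step merges the partial results.
from concurrent.futures import ThreadPoolExecutor

def worker(subtopic: str) -> str:
    return f"findings on {subtopic}"        # placeholder for an LLM call

def swarm(topics: list[str]) -> str:
    with ThreadPoolExecutor(max_workers=len(topics)) as pool:
        partials = list(pool.map(worker, topics))   # order-preserving
    # Aggregation is where output format conflicts surface in real systems.
    return "\n".join(partials)
```

Note that the aggregation line is the one-line version of the hardest part; merging free-form LLM outputs is where swarms earn their reputation.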

Which Topology to Choose

The answer comes down to two variables: task complexity (how many distinct steps with conflicting requirements) and parallelism opportunity (how much of the work is genuinely independent).

| Situation | Recommended topology |
| --- | --- |
| Under 6 steps, same reasoning style throughout | Monolith |
| Steps require incompatible system prompts | Handoff |
| Tasks are genuinely independent and latency matters | Swarm |
| Uncertain — start here | Monolith; add complexity when you hit a wall |
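The table reduces to a short decision function. The thresholds are this article's rough heuristics, not hard rules, and the check order matters: parallelism and prompt conflict override step count.

```python
# The topology decision table as a sketch function. Thresholds and ordering
# follow the article's heuristics; treat them as defaults, not laws.
def choose_topology(steps: int, prompts_conflict: bool,
                    parallel_and_latency_bound: bool) -> str:
    if parallel_and_latency_bound:
        return "swarm"
    if prompts_conflict:
        return "handoff"
    return "monolith"       # including the uncertain case: start here
```

When in doubt, every path that isn't clearly swarm or handoff falls through to monolith, which matches the "start here" row.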

The Question Nobody Asks

Here is the question that almost never comes up in topology discussions: does the task itself actually need more than one agent? Most teams skip past that question and go straight to "which pattern should we use?" The honest answer starts one step earlier.

What I see in practice is teams picking a topology based on what their chosen orchestration framework makes easy. CrewAI makes a handoff chain feel natural because the Crew/Agent/Task abstraction is right there. LangGraph makes a coordinator-plus-workers pattern feel natural because the graph primitives cleanly support it. Swarm makes handoffs cheap because handoffs are the only primitive it has. So the framework quietly shapes the architecture. The architecture shapes the failure modes. The failure modes shape your on-call rotation. Six months later you're debugging a swarm that should have been a monolith, and the original reason was "CrewAI felt easy on Tuesday."

The trap to avoid: picking the topology your tooling makes convenient rather than the one your task actually requires. If your task is three sequential steps with the same system prompt, that's a monolith. It doesn't matter how elegant the handoff pattern looks in a CrewAI tutorial. Framework ergonomics are a poor proxy for architectural fit. Write the task decomposition on paper before you open a framework. Then pick.

One tax worth pricing before you commit to a swarm or handoff pattern: token cost. Every handoff adds context window overhead because the receiving agent needs enough of the prior state to do its job. A 5-agent swarm on a non-trivial task can easily 3x your token spend versus a monolith running the same work with a single context. For production systems that hit LLM APIs thousands of times per day, "parallel chaos" is also "parallel invoice."
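The back-of-envelope arithmetic behind that "3x" is worth making explicit. The numbers below are illustrative assumptions (10k base tokens, 4k of carried context per worker), not measurements; the shape of the formula is the point.

```python
# Illustrative token-cost arithmetic for a swarm vs a monolith.
# Each extra agent re-reads a slice of prior state on top of its own work,
# so the overhead scales with agent count times carried context.
def swarm_token_cost(base_tokens: int, n_agents: int,
                     carried_context_per_agent: int) -> int:
    return base_tokens + n_agents * carried_context_per_agent

monolith_cost = 10_000                            # one shared context, assumed
five_agent_cost = swarm_token_cost(10_000, 5, 4_000)
ratio = five_agent_cost / monolith_cost           # 3.0x under these assumptions
```

Plug in your own base and per-handoff context sizes; the multiplier compounds again if agents retry or re-plan.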

FAQ

What is the monolith topology in multi-agent systems?

A monolith topology uses a single orchestrating agent that runs all tasks sequentially or manages all tools directly. No sub-agents, no handoffs. The word "monolith" carries baggage in software engineering, but for agents it's usually the right call. Lowest failure surface, highest debuggability. For most use cases that claim to need multi-agent systems, a well-designed monolith with parallel tool calls is the right answer.

What is the handoff topology and when should I use it?

Handoff topology passes control and context from one specialized agent to the next. Agent A does task A, packages its output, and hands off to Agent B. This works well when tasks need distinctly different specializations that conflict (a research agent and an editing agent with different prompting strategies). The failure mode: handoffs accumulate context loss at each boundary. If Agent A's output format doesn't match Agent B's expectations exactly, you get a failure that looks like Agent B's problem but is actually Agent A's output quality. Use handoff when specialization genuinely requires incompatible prompting, not just different tasks.

What is the swarm topology?

Swarm topology runs multiple agents in parallel on independent sub-tasks, then aggregates the results. The appeal is throughput. A 10-task pipeline that takes 10 minutes sequentially can take 1 minute with 10 parallel agents. The catch: task independence is harder to guarantee than it looks. If Agent 3 needs Agent 1's output to do its job, your swarm has a hidden sequential dependency and the parallelism is fake. Swarms also need more careful state management. You need somewhere to collect and merge partial outputs.
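Hidden sequential dependencies can be checked before dispatch rather than discovered in production. The sketch below assumes your planner can emit a dependency map for the sub-tasks; it layers tasks into waves that can genuinely run in parallel (Kahn-style topological layering), so "fake parallelism" shows up as a multi-wave plan instead of a surprise.

```python
# Pre-flight parallelism check: group sub-tasks into waves where every task
# in a wave has all its prerequisites satisfied by earlier waves.
# The deps map (task -> set of prerequisite tasks) is a hypothetical
# planner output, not a framework API.
def parallel_waves(deps: dict[str, set[str]]) -> list[set[str]]:
    remaining = {t: set(d) for t, d in deps.items()}
    waves: list[set[str]] = []
    while remaining:
        ready = {t for t, d in remaining.items() if not d}
        if not ready:
            raise ValueError("dependency cycle in sub-tasks")
        waves.append(ready)
        for t in ready:
            del remaining[t]
        for d in remaining.values():
            d -= ready                      # mark completed prerequisites
    return waves
```

A one-wave result means the swarm is real; more waves than one means part of your "parallel" pipeline is sequential, and you can price that before committing.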

Which topology should I start with?

Start with the monolith. Add complexity only when you hit a concrete wall: the monolith prompt is too long for reliable instruction following, tasks genuinely need incompatible system prompts, or you have proven latency requirements that only parallel execution can meet. Most teams reach for multi-agent complexity long before they've exhausted what a well-designed single agent can do.

What are the most common production failure modes for each topology?

Monolith: context window overflow when tasks accumulate history; prompt interference between early and late instructions. Handoff: context loss at boundaries; the receiving agent ignores a malformed handoff and hallucinates instead of erroring. Swarm: phantom parallelism (dependencies force sequential execution anyway); output aggregation failures when formats don't align; the coordinator agent turning into a bottleneck. Full failure catalog publishes with the Season 2 podcast.

Also in Season 2: Agentic Orchestration
S2E1
Framework Comparison — LangGraph vs CrewAI vs 4 Others

Same 3-step agent built in six frameworks. DX, cost, reliability, and debuggability scored.

S2E3
Langfuse vs LangSmith vs Arize Phoenix vs Helicone

Which tool actually shows you what your agent is doing — and which hides the most important signals.
