
AI Team Topology: Where LLM Engineers Actually Sit

The Org Design That Ships AI

Team Topologies (Skelton & Pais, 2019) gave us four fundamental team types: stream-aligned, enabling, complicated-subsystem, and platform. The framework reshaped how engineering organizations think about cognitive load and team interaction. But where does the AI team fit? In none of them, or in all of them, depending on your maturity stage.

Most organizations get this wrong. They centralize AI into a bottleneck, scatter it across product teams without shared infrastructure, or build a platform nobody uses. Conway's Law punishes every one of these mistakes by encoding bad org design directly into the AI system architecture.

Team Topologies wasn't designed for AI

The original Team Topologies framework assumes teams build and own discrete services. The interaction modes (collaboration, X-as-a-Service, facilitating) assume bounded, predictable interfaces between teams. AI breaks these assumptions.

First, AI teams produce probabilistic systems. A platform team shipping a CI/CD pipeline can define a stable contract: push code, get a build artifact. An ML platform team shipping a model serving layer cannot guarantee deterministic behavior. The "contract" between the platform and its consumers includes model performance, latency budgets, drift detection, and evaluation criteria that change with every retraining cycle. This is fundamentally different from traditional platform engineering.
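One way to make that moving contract concrete is to write it down as data both sides can check after each retraining cycle. The sketch below is illustrative only: `ServingContract`, `Observed`, `violations`, and every threshold value are hypothetical names and numbers, not part of Team Topologies or of any particular serving stack.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServingContract:
    """Hypothetical contract between an ML platform and a consuming team."""
    model_version: str
    min_eval_score: float      # agreed floor on an offline evaluation metric
    max_p95_latency_ms: float  # latency budget for a single prediction
    max_drift_score: float     # ceiling on whichever drift metric both sides agree on


@dataclass(frozen=True)
class Observed:
    """Measurements taken after a retraining cycle (assumed inputs)."""
    eval_score: float
    p95_latency_ms: float
    drift_score: float


def violations(contract: ServingContract, observed: Observed) -> list[str]:
    """Return the names of the contract terms that the measurements break."""
    broken = []
    if observed.eval_score < contract.min_eval_score:
        broken.append("eval_score")
    if observed.p95_latency_ms > contract.max_p95_latency_ms:
        broken.append("p95_latency_ms")
    if observed.drift_score > contract.max_drift_score:
        broken.append("drift_score")
    return broken


contract = ServingContract("recsys-v7", 0.80, 150.0, 0.2)
after_retrain = Observed(eval_score=0.83, p95_latency_ms=180.0, drift_score=0.1)
print(violations(contract, after_retrain))  # only the latency budget is breached
```

Unlike a CI/CD contract, every field here has to be rechecked whenever the model changes, which is the point the paragraph above makes.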

Second, AI work spans the entire stack. A single recommendation feature touches data engineering, model training, model serving, API design, frontend integration, and A/B testing infrastructure. No single Team Topologies type cleanly owns all of that. Stream-aligned teams lack ML infrastructure knowledge. Platform teams lack product context. Enabling teams lack the sustained ownership to see a model through to production.

Third, the talent market forces premature decisions. ML engineers and AI engineers are expensive and scarce. Organizations hire one or two, put them somewhere in the org chart, and that initial placement calcifies into a topology that persists long after it stops making sense. The first hire's manager becomes the "AI team lead" by default, not by design.

The result is topology drift: the AI team's formal position in the org chart diverges from the actual work being done. Shadow processes appear. Infrastructure gets duplicated. The classic "we need to talk to the AI team" bottleneck slows down every product team simultaneously.

Four models for AI team placement

Every organization deploying AI lands on one of these four topologies, whether they chose it deliberately or drifted into it. Each has a natural habitat and a failure mode that triggers the reorg.

01

Centralized AI Team

The CoE Model

One team owns all AI work for the entire company. Product teams submit requests, the AI team prioritizes and delivers. The team typically reports to a VP of Engineering or CTO, sits adjacent to data engineering, and operates its own backlog independent of product sprints.

When it works: Early maturity (0-2 models in production). The company is still figuring out what AI can do for them. Demand is low enough that one team can serve it without creating a months-long queue. The initial models are experimental, and centralizing expertise accelerates learning. Startups under 30 engineers and enterprises in the first year of an AI initiative fit here.

When it breaks: The moment demand outpaces bandwidth. Product teams start waiting weeks for the AI team's attention. Priority conflicts become political. The centralized team becomes a gatekeeper rather than an enabler. Two failure modes: (a) the team becomes so specialized in one domain that it can't context-switch to another, or (b) it spreads so thin across domains that quality drops everywhere.

Team Topologies mapping: The centralized AI team functions as an enabling team at best, a complicated-subsystem team at worst. Enabling when it's actively transferring knowledge to product teams and building toward its own dissolution. Complicated-subsystem when it's hoarding expertise and creating permanent dependencies.

Real example: A mid-market SaaS company (200 engineers, $50M ARR) stands up a 4-person ML team to ship their first recommendation engine. The team succeeds because they have one customer (the product team building recommendations) and one model to ship. Eighteen months later, five product teams want AI features, the ML team is a 6-month bottleneck, and the reorg begins.

02

Embedded AI Engineers

The Feature Team Model

AI engineers sit directly inside product teams. They report to the product team's engineering manager, attend the same stand-ups, and ship AI features alongside frontend and backend engineers. There is no centralized AI team; each product team owns its own AI capabilities end-to-end.

When it works: When AI is a feature, not the product. When the AI work is primarily integration (calling LLM APIs, building RAG pipelines, tuning prompts) rather than model development. When each product team's AI needs are distinct enough that shared infrastructure would be premature abstraction. AI-augmented products (vs. AI-native products) fit this model well.

When it breaks: Infrastructure duplication. Team A builds a vector database integration, Team B builds a different one. Team C builds a prompt management system, Team D rolls their own. Within a year you have four RAG architectures, three evaluation frameworks, and zero shared model registry. Inconsistency in AI behavior across the product surface becomes a UX problem. Embedded engineers get isolated from ML peers and stop developing their craft.
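Duplication of this kind is easy to measure once you have an inventory of who runs what. A minimal sketch, assuming you can collect (team, capability, tool) rows by hand or from a service catalog; the team and tool names below are invented for illustration.

```python
from collections import defaultdict

# Hypothetical inventory rows: (team, capability, tool). All names are made up.
inventory = [
    ("team-a", "vector_db", "vector-store-x"),
    ("team-b", "vector_db", "vector-store-y"),
    ("team-c", "prompt_management", "in-house-yaml"),
    ("team-d", "prompt_management", "spreadsheet"),
    ("team-a", "evaluation", "notebook-scripts"),
    ("team-b", "evaluation", "notebook-scripts"),
]


def duplicated_capabilities(rows):
    """Map each capability to its distinct tools, keeping only those with more than one."""
    tools = defaultdict(set)
    for _team, capability, tool in rows:
        tools[capability].add(tool)
    return {cap: sorted(names) for cap, names in tools.items() if len(names) > 1}


print(duplicated_capabilities(inventory))
```

Capabilities where every team uses the same tool (here, `evaluation`) drop out, so the result lists only the places where consolidation is worth discussing.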

Team Topologies mapping: AI engineers are simply part of stream-aligned teams. This is the purest Team Topologies implementation, but it only works when the AI work is bounded enough to fit inside a single team's cognitive load budget without requiring deep ML platform knowledge.

Real example: A product-led growth company (80 engineers) where three product teams each have one AI engineer building LLM-powered features (smart search, content generation, customer support chatbot). Works fine for 18 months. Then the CEO asks "why do our three AI features behave so differently?" and "why are we paying three separate vector DB bills?" and the reorg conversation starts.

03

Platform AI Team

The ML Platform Model

The AI team builds and operates shared ML infrastructure: model registries, feature stores, experiment tracking, model serving, evaluation pipelines, prompt management, vector databases, and guardrails. Product teams consume this platform to build their own AI features without needing deep ML ops expertise. The platform team ships tooling, not models.

When it works: At scale (10+ models in production, 5+ teams building AI features). When the infrastructure duplication from the embedded model has become too expensive. When you have enough internal demand to justify a dedicated platform investment. When the platform team has clear users and can define X-as-a-Service contracts with measurable SLOs.

When it breaks: When the platform gets too far from product reality. The classic failure: the platform team builds what they think product teams need (usually more abstraction, more configurability) while product teams need something simpler (a working example, good defaults, fast iteration cycles). The platform becomes overengineered for the sophistication of its users. Second failure mode: the platform team has no embedded context, so they can't help when things go wrong in production. Product teams still need "someone who knows ML" on the team, which pushes you toward the hybrid model described next.

Team Topologies mapping: A clean platform team in the Team Topologies sense. Operates X-as-a-Service with stream-aligned teams as consumers. The interaction mode is clearly defined: product teams call the platform's APIs, the platform team maintains the infrastructure. This is the most natural Team Topologies fit of the four models.

Real examples: Spotify's ML platform team and Uber's Michelangelo, or any company with enough models in production that the build/serve/monitor loop needs to be a shared, maintained service rather than copy-pasted boilerplate in each team's repo. At this scale, the platform team is typically 8-15 engineers and product teams have 0-2 AI engineers each who focus on application logic, not infrastructure.

04

Hybrid Hub-and-Spoke

The Consensus Model

A central AI platform team provides shared infrastructure and best practices. Embedded AI specialists sit inside product teams and consume the platform while maintaining product context. An enabling function (sometimes called "AI guild" or "ML community of practice") connects the embedded specialists so they share learnings and prevent drift. Three layers: platform (the hub), embedded specialists (the spokes), and the connecting tissue (the guild).

When it works: For organizations with 50+ engineers and 3+ teams building AI features. This is the consensus model for most companies past the experimentation phase. It solves the three failure modes of the previous models simultaneously: no bottleneck (product teams have their own AI people), no duplication (shared platform handles infrastructure), no isolation (guild connects embedded specialists). It maps directly to Team Topologies: a platform team, stream-aligned teams with embedded AI engineers, and a thin enabling team that facilitates knowledge sharing.

When it breaks: Coordination overhead. Three layers means three places where misalignment can hide. The platform team builds something the spokes don't use. The spokes go rogue and build custom solutions because the platform is "too slow." The guild becomes a talking shop without decision-making authority. The tax on this model is continuous alignment work: the Head of AI (or equivalent) can spend as much as 40% of their time on internal coordination rather than technical work.

Team Topologies mapping: This IS Team Topologies applied correctly. Platform team (AI infrastructure) + stream-aligned teams (with embedded AI specialists) + enabling team (the guild/community of practice). The interaction modes are: X-as-a-Service between platform and stream-aligned teams, facilitating between the enabling team and everyone else. Collaboration mode activates temporarily when a new capability is being built (platform collaborates with the first product team to use it, then shifts to X-as-a-Service for subsequent teams).
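The mode-switching rule in that mapping is small enough to write down. A minimal sketch, assuming two hypothetical provider labels (`"platform"` and `"guild"`) and a count of product teams already using a capability; the function name and parameters are invented for illustration.

```python
from enum import Enum


class Mode(Enum):
    COLLABORATION = "collaboration"
    X_AS_A_SERVICE = "x-as-a-service"
    FACILITATING = "facilitating"


def interaction_mode(provider: str, adopters_so_far: int = 0) -> Mode:
    """Pick the interaction mode a provider uses with a product team.

    provider: "platform" or "guild" (hypothetical labels).
    adopters_so_far: product teams that already use the capability in question.
    """
    if provider == "guild":
        return Mode.FACILITATING
    if provider == "platform":
        # The first adopter gets close collaboration; later teams get a service.
        return Mode.COLLABORATION if adopters_so_far == 0 else Mode.X_AS_A_SERVICE
    raise ValueError(f"unknown provider: {provider!r}")


print(interaction_mode("platform", adopters_so_far=0).value)
print(interaction_mode("platform", adopters_so_far=3).value)
```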

Real example: A Series D fintech (300 engineers, 40 in AI/ML across 8 product teams). Central ML platform team of 12 handles model serving, feature store, experiment tracking, and evaluation infrastructure. Each product team has 2-4 AI engineers who build models and RAG pipelines on the platform. Bi-weekly ML guild meetings share learnings. A Head of AI (reporting to CTO) owns the platform team directly and has dotted-line influence over embedded AI hires.

Maturity-based topology selection

Pick based on where you are today, not where you want to be in three years. Premature platformification is as dangerous as staying centralized too long.

| Maturity Stage | Models in Prod | Recommended Topology | Why |
| --- | --- | --- | --- |
| Experimentation | 0-2 | Centralized | Concentrate scarce expertise. Learn what works before distributing. The team's job is to prove AI value, not scale it. |
| Early Production | 3-5 | Embedded or early Hub-and-Spoke | Demand now exceeds centralized capacity. Either embed engineers into the teams with highest AI leverage, or begin splitting into platform + embedded. |
| Scaling | 6-10 | Hub-and-Spoke | Infrastructure duplication is now expensive. Platform investment has clear ROI. Enough teams are using AI that patterns have emerged. |
| At Scale | 10+ | Platform + Embedded Specialists | Full platform with well-defined contracts. Embedded specialists are experienced enough to operate independently. Guild keeps alignment. |
| AI-Native | AI is the product | Stream-aligned AI teams from day one | When AI is the core product (not a feature), AI engineers ARE the product engineers. No need for a separate "AI team" topology. Build a platform when infrastructure needs warrant it, same as any other engineering platform decision. |
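The maturity table reads naturally as a lookup. A minimal sketch: the function name and parameters are hypothetical, and because the table lists both "6-10" and "10+", exactly 10 models is treated here as "At Scale" to match the "10+" wording used elsewhere in this article. That boundary is an assumption, not something the table settles.

```python
def recommend_topology(models_in_prod: int, ai_is_the_product: bool = False) -> str:
    """Look up the recommended topology from the maturity table.

    Assumption: exactly 10 models counts as "At Scale" rather than "Scaling".
    """
    if models_in_prod < 0:
        raise ValueError("models_in_prod must be non-negative")
    if ai_is_the_product:
        return "Stream-aligned AI teams from day one"
    if models_in_prod <= 2:
        return "Centralized"
    if models_in_prod <= 5:
        return "Embedded or early Hub-and-Spoke"
    if models_in_prod <= 9:
        return "Hub-and-Spoke"
    return "Platform + Embedded Specialists"


for count in (1, 4, 7, 12):
    print(count, "->", recommend_topology(count))
```

Treat the output as a starting point for the conversation rather than a verdict: model count is a proxy for maturity, not a measure of it.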

The reorg trigger: You need to change topologies when the queue for AI team time exceeds 4-6 weeks, when product teams start building shadow AI infrastructure, when model quality varies wildly across teams, or when your best AI engineers quit because they feel isolated from peers. Any one of these signals means your current topology has outlived its usefulness.
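Those four signals can be tracked as a simple checklist. The sketch below is one hedged way to do it: `TopologySignals` and `reorg_triggers` are hypothetical names, the default queue limit uses the lower end of the 4-6 week range above, and the quality-spread threshold is an invented placeholder you would need to calibrate yourself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopologySignals:
    queue_weeks: float         # current wait for AI team time
    shadow_infra_teams: int    # product teams running their own AI infrastructure
    quality_spread: float      # hypothetical: best minus worst team eval score
    isolation_departures: int  # AI engineers who left citing isolation from peers


def reorg_triggers(s: TopologySignals,
                   queue_limit_weeks: float = 4.0,
                   quality_spread_limit: float = 0.2) -> list[str]:
    """Return the names of the signals that have fired; any one is enough."""
    fired = []
    if s.queue_weeks > queue_limit_weeks:
        fired.append("queue")
    if s.shadow_infra_teams > 0:
        fired.append("shadow_infrastructure")
    if s.quality_spread > quality_spread_limit:
        fired.append("quality_variance")
    if s.isolation_departures > 0:
        fired.append("isolation_attrition")
    return fired


today = TopologySignals(queue_weeks=7, shadow_infra_teams=2,
                        quality_spread=0.05, isolation_departures=0)
print(reorg_triggers(today))
```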

The roles that actually exist in 2026

Job titles in AI are still chaotic. The same work might be titled "ML Engineer" at one company and "AI Engineer" at another, and the same title can cover completely different work. The distinctions matter for org design because each role has different infrastructure needs, team placement, and career ladders.

ML Engineer

Focus: Model development

Trains models from scratch or fine-tunes foundation models. Works with training pipelines, feature engineering, hyperparameter optimization, and offline evaluation. Needs GPU clusters, experiment tracking (Weights & Biases, MLflow), and large datasets. Reports to: ML team lead or Head of AI. Typically sits in the platform or centralized team.

AI Engineer

Focus: Model integration

Integrates pre-trained models (especially LLMs) into production systems. Builds RAG architectures, agent frameworks, prompt pipelines, and evaluation harnesses. Needs API access, orchestration frameworks (LangChain, LlamaIndex, custom), and production deployment tooling. Reports to: product team EM or AI team lead. Typically embedded in product teams.

ML Platform Engineer

Focus: ML infrastructure

Builds and operates the shared ML platform: model serving (Triton, vLLM, TGI), feature stores, model registries, training orchestration, monitoring, and cost management. Pure infrastructure role with ML domain knowledge. Reports to: platform team lead. Lives in the platform team. Does not build models.

AI Product Manager

Focus: AI product strategy

Different from regular PMs because AI products have non-deterministic behavior, require different evaluation methods (not just A/B tests), and need ongoing monitoring post-launch. Must understand model capabilities and limitations well enough to set realistic expectations with stakeholders. Partners with AI engineers on evaluation criteria and failure-mode documentation.

Prompt Engineer

Focus: LLM behavior design

A contested role. In 2024, companies hired dedicated prompt engineers. By 2026, most organizations treat prompting as a skill that AI engineers own rather than a standalone position. Where it persists as a role: companies with dozens of LLM-powered features that need systematic prompt management, versioning, and A/B testing across a shared prompt registry. Otherwise, it is folded into the AI engineer scope.

AI Safety / Red Team

Focus: Model security and alignment

Tests AI systems for failure modes: prompt injection, harmful outputs, bias, data leakage, jailbreaks. Reports to: CISO (if security-first), Head of AI (if product-first), or operates as an independent function with a direct line to leadership. Should NOT report to the same team that built the system being tested. See our AI red teaming guide for implementation details.

The distinction that drives org design: ML Engineers and ML Platform Engineers belong in centralized or platform teams because their work is horizontal (serving multiple product surfaces). AI Engineers belong in product teams because their work is vertical (shipping one feature end-to-end). AI Product Managers go wherever the AI engineers are. Red Team is independent by definition. Getting this wrong means you either starve product teams of the people they need (by centralizing AI engineers who should be embedded) or you fragment infrastructure expertise (by embedding platform engineers who should be centralized).
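The horizontal/vertical rule can be checked against a roster. A minimal sketch, assuming hypothetical role and team-type labels and an invented roster; AI Product Managers are left out of the table because their placement follows the AI engineers rather than a fixed team type.

```python
# Expected home for each role, following the horizontal/vertical split above.
# Role and team-type labels are hypothetical.
EXPECTED_HOME = {
    "ml_engineer": {"centralized", "platform"},
    "ml_platform_engineer": {"centralized", "platform"},
    "ai_engineer": {"product"},
    "red_team": {"independent"},
}


def misplaced(roster):
    """Return the (person, role, team_type) rows whose team type does not fit the role."""
    return [
        row for row in roster
        if row[1] in EXPECTED_HOME and row[2] not in EXPECTED_HOME[row[1]]
    ]


roster = [
    ("ana", "ai_engineer", "centralized"),         # should be embedded
    ("ben", "ml_platform_engineer", "product"),    # should be centralized
    ("chi", "ml_engineer", "platform"),
    ("dev", "red_team", "independent"),
]
print(misplaced(roster))
```

The two flagged rows correspond to the two mistakes named above: a centralized AI engineer who should be embedded, and an embedded platform engineer who should be centralized.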

Your AI architecture already mirrors your org chart

Melvin Conway observed in 1967 that organizations design systems mirroring their communication structures. In AI, this law is especially punishing because the consequences stay invisible until production.

If the AI team is isolated from product: your AI features will feel bolted on. The model will be technically impressive but poorly integrated into the user workflow. The handoff between "model output" and "product experience" will be a JSON blob thrown over a wall.

If the AI team is fragmented across product teams: your models won't share infrastructure. You'll have three different vector databases, two incompatible evaluation frameworks, and zero institutional knowledge about what works. Each team relearns lessons the others already learned.

If the AI team reports exclusively to engineering: your AI features will be technically excellent but product-blind. They'll optimize for model performance metrics (F1 score, latency) rather than user outcomes (task completion, time saved). The system will be over-architected for problems users don't have.

If the AI team reports exclusively to product: your AI infrastructure will be underinvested. Each feature ships with custom glue code, no shared evaluation pipeline, no model monitoring, no cost tracking. The "AI debt" compounds until a major incident forces a platform investment.

The topology you choose is the architecture you'll get. Pick the one that produces the architecture you want, not the one that's easiest to hire into.

Read the full analysis: Conway's Law in the Age of AI Teams

The 90-day topology transition

Reorgs fail when they're announced as big-bang changes. Topology transitions work when they're executed as a series of small, reversible moves with clear success criteria at each step.

Weeks 1-2: Audit the current state. Map where every AI engineer actually spends their time (not their job description, their calendar). Identify the shadow processes: who do product teams actually go to when they need AI help? Where is infrastructure being duplicated? Where are people waiting in a queue?

Weeks 3-4: Define the target topology. Use the maturity framework above. Name the teams, the reporting lines, and the interaction modes. Write down what "success" looks like in 90 days (measurable: queue time, model deployment frequency, infrastructure consolidation).

Weeks 5-8: Execute the transition. Move one team at a time. Start with the team that has the clearest value case for the new topology. Don't move everyone simultaneously. Each move creates a proof point that makes the next move easier to justify.

Weeks 9-12: Stabilize and measure. Are queue times shorter? Is infrastructure consolidation happening? Are embedded engineers still connected to their peers? Is the platform team shipping things product teams actually use? Adjust based on data, not opinions.
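The three measurable success criteria named in the plan (queue time, model deployment frequency, infrastructure consolidation) can be compared between the audit baseline and day 90. A minimal sketch; `Snapshot`, `improvements`, and the sample numbers are hypothetical, and counting distinct infrastructure stacks is just one possible proxy for consolidation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    queue_weeks: float          # wait for AI help, in weeks
    deploys_per_month: float    # model deployments per month
    distinct_infra_stacks: int  # assumed proxy for how consolidated infrastructure is


def improvements(baseline: Snapshot, day_90: Snapshot) -> dict[str, bool]:
    """Report, per success criterion, whether day 90 beats the baseline."""
    return {
        "queue_time": day_90.queue_weeks < baseline.queue_weeks,
        "deployment_frequency": day_90.deploys_per_month > baseline.deploys_per_month,
        "infrastructure_consolidation":
            day_90.distinct_infra_stacks < baseline.distinct_infra_stacks,
    }


baseline = Snapshot(queue_weeks=8, deploys_per_month=2, distinct_infra_stacks=4)
day_90 = Snapshot(queue_weeks=3, deploys_per_month=5, distinct_infra_stacks=4)
print(improvements(baseline, day_90))
```

In this invented example, queue time and deployment frequency improved while consolidation has not started, which is the kind of result that should shape the next adjustment.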

The hard part: Reporting line changes mean someone loses headcount. The VP who had 8 AI engineers on their team now has 3 (because 5 moved to the platform team). This is a political problem, not a technical one, and it requires executive sponsorship. If the CTO isn't willing to own the reorg, it won't stick.

Five AI team anti-patterns I've seen repeatedly

1. The "AI Center of Bottleneck." A centralized team that takes requests from everyone, prioritizes by squeakiest wheel, and delivers 3-6 months after the business needed the feature. Product teams route around it by hiring contractors or using no-code AI tools. The centralized team becomes irrelevant to the actual AI work happening in the company.

2. The "Lone Wolf" embedded engineer. One AI engineer in a product team of 12 backend/frontend engineers. They have no ML peers to review their work, no infrastructure support, and slowly become a full-stack engineer who occasionally does AI work. Their prompts are unversioned, their evaluation is manual, and when they leave, the AI feature breaks within two months.

3. The "Platform for Nobody." A platform team that builds elaborate ML infrastructure (Kubernetes operators, custom feature stores, bespoke model registries) that no product team is sophisticated enough to use. The platform team's roadmap is disconnected from actual product needs. Six months in, product teams are still deploying models via Jupyter notebooks and Docker containers because the platform's learning curve is too steep.

4. The "Research Lab." An AI team optimizing for paper-publishable results rather than production impact. They build state-of-the-art models that never ship because they require infrastructure that doesn't exist, data pipelines that aren't built, or latency budgets that aren't achievable. The team is evaluated on novelty, not production metrics.

5. The "Stealth AI Team." A product team that quietly accumulates AI capabilities without organizational awareness. They build an LLM-powered feature, then another, then another. By the time leadership notices, they have 5 models in production with no monitoring, no cost tracking, no security review, and no documentation. A prompt injection incident or a surprise $50K GPU bill triggers a panicked reorg.

AI Team Topology: Frequently Asked Questions

What type of team should AI/ML be in Team Topologies?

It depends on maturity. At early stages (0-2 models in production), the AI team functions as an enabling team that upskills stream-aligned product teams. At scale (10+ models), it splits into a platform team (providing shared ML infrastructure) and embedded enabling specialists. The mistake most organizations make is treating AI as a complicated-subsystem team from day one, which isolates it from product reality and creates a bottleneck.

Should the AI team report to the CTO or the CPO?

If AI is infrastructure (model serving, feature stores, vector databases), it reports to the CTO alongside platform engineering. If AI is product differentiation (recommendation engines, generative features, autonomous agents), it reports to the CPO through a Head of AI Product. Most Series C+ companies end up with a dotted-line structure: engineering reports to CTO, product priorities come from CPO. The reporting line determines what gets built first, so choose based on where AI creates the most value for your business.

How big should an AI team be for a Series B startup?

Three to five engineers is the sweet spot for a Series B with one or two AI-powered product features. The minimum viable team is: one senior ML engineer who owns model development, one ML platform/ops engineer who owns serving and monitoring, and one AI-focused product manager who owns prioritization and evaluation. Add one or two more engineers when you hit the "second model" threshold. Do not hire ahead of demand; underutilized ML engineers atrophy fast.

What's the difference between an ML engineer and an AI engineer?

ML engineers train and optimize models from scratch or fine-tune foundation models. They work with training pipelines, feature engineering, hyperparameter tuning, and model evaluation. AI engineers integrate pre-trained models (especially LLMs) into production systems. They work with prompt engineering, RAG architectures, agent frameworks, API orchestration, and evaluation harnesses. The distinction matters for org design: ML engineers need GPU clusters and experiment tracking; AI engineers need API access and production deployment pipelines.

When should you hire a Head of AI vs embed engineers in product teams?

Hire a Head of AI when you have three or more AI-powered features shipping or planned, when model quality directly impacts revenue, or when you need organizational coherence across multiple teams using AI differently. Embed engineers without a Head of AI when you have one product surface using AI, when the work is mostly integration (calling APIs, building RAG) rather than model development, or when the AI work is owned by a single product manager already.

Is the AI Center of Excellence model dead?

Not dead, but evolving. The pure CoE model (one central team does all AI work for the company) breaks at scale because it becomes a bottleneck: every product team queues for CoE time. The pattern that works in 2026 is a transitional CoE that deliberately works itself out of a job. It starts centralized, builds the platform and practices, trains product teams, then dissolves into a platform team plus embedded specialists. The CoE that stays centralized forever becomes the "ivory tower" that product teams route around.

Thomas Prommer, Technology Executive (CTO/CIO/CTAIO)

