MCP Security

Prompt Injection & Tool Risk in the Model Context Protocol

The Model Context Protocol gives an agent a standard way to call tools and reach data. It also gives an attacker a standard way in. MCP does not invent new LLM vulnerabilities — it amplifies the existing ones, because every connected server is a new path for untrusted content and a new holder of credentials. This guide covers the six MCP threat classes, the OAuth authorization model, and the least-privilege defense stack that keeps a useful agent from becoming a confused deputy.

30-SECOND EXECUTIVE TAKEAWAY

  • Every MCP server is untrusted code with credentials. Vet it, pin it, run it least-privilege, and monitor its egress — the same way you would any third-party dependency with network access.
  • Tool output is an injection vector. Whatever a server returns, the agent reads as input. Treat all tool results as untrusted and constrain what the agent can do with them.
  • Authorization helps; it does not save you. The OAuth model is a real improvement over passthrough tokens, but security comes from scoping credentials per server and gating sensitive actions — not from the spec alone.

Why MCP needs its own threat model

The Model Context Protocol solved a real problem: before it, every agent-to-tool connection was a bespoke integration. A single open standard for exposing tools and data to a model made agents dramatically more capable. The cost of that capability is surface area. Each MCP server an agent connects to is a new place untrusted content can enter, a new component that holds credentials, and a new piece of code running with whatever access you granted it.

None of the individual risks is new. Prompt injection, excessive agency, and improper output handling already appear in the OWASP Top 10 for LLM Applications. What MCP changes is the multiplier: an agent with ten connected servers has ten injection paths, ten credential holders, and ten pieces of third-party code in its trust boundary. The threat model below is those existing risks, instantiated at the tool boundary.

THE SIX MCP THREAT CLASSES

What goes wrong, and how to contain it

Each threat pairs a plain-language description with the practical control. None is exotic; the danger is that MCP makes all six easy to introduce by simply connecting one more server.

T1

Indirect prompt injection via tool output

Attacker instructions embedded in data an MCP server returns (a file, an issue, an email, a web page) are read by the agent as input.

Mitigation: Treat all tool outputs as untrusted. Constrain tool chaining. Require approval before acting on retrieved content. See the prompt injection guide.
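A minimal sketch of the approval rule, assuming a hypothetical agent loop that tracks whether untrusted tool output has entered the context. The names `Session`, `SENSITIVE_TOOLS`, and `may_call` are illustrative and not part of MCP or any SDK.

```python
from dataclasses import dataclass, field

# Hypothetical tool names; a real deployment would load these from policy.
SENSITIVE_TOOLS = {"send_email", "delete_file", "create_payment"}


@dataclass
class Session:
    """Tracks whether untrusted tool output has entered the agent's context."""
    tainted: bool = False
    history: list = field(default_factory=list)

    def record_tool_result(self, tool: str, result: str) -> str:
        # Every tool result is treated as untrusted, so any result taints the session.
        self.tainted = True
        self.history.append(tool)
        # Delimit the result so the prompt marks it as data, not instructions.
        # Delimiters help the model but are not a security boundary on their own.
        return (
            f"<untrusted_tool_output tool={tool!r}>\n"
            f"{result}\n"
            f"</untrusted_tool_output>"
        )

    def may_call(self, tool: str, human_approved: bool = False) -> bool:
        # Once untrusted content is in context, sensitive tools need explicit approval.
        if tool in SENSITIVE_TOOLS and self.tainted:
            return human_approved
        return True
```

The gate, not the delimiter, is what contains the injection: even if the model follows an embedded instruction, the sensitive call stalls until a person approves it.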

T2

Tool-definition poisoning

A malicious server hides instructions in tool descriptions/metadata the model reads to decide how to use a tool.

Mitigation: Vet server source. Review tool definitions. Pin versions so an approved server cannot silently redefine its tools (the "rug pull").
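One way to detect a silent redefinition is to fingerprint the tool definitions at review time and compare on every connection. The sketch below assumes each definition is a JSON-serializable dict with a `name` key; the function names are hypothetical.

```python
import hashlib
import json


def fingerprint(tool_definitions: list) -> str:
    """Return a stable SHA-256 digest of a server's tool definitions."""
    # Sort by tool name and serialize with sorted keys so that ordering
    # differences do not change the digest, but any content change does.
    ordered = sorted(tool_definitions, key=lambda tool: tool["name"])
    canonical = json.dumps(ordered, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def definitions_unchanged(tool_definitions: list, approved_digest: str) -> bool:
    """True only if the definitions match what was reviewed and approved."""
    return fingerprint(tool_definitions) == approved_digest
```

A mismatch should block the connection and send the server back through review rather than just log a warning.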

T3

Malicious or compromised MCP server

A server installed from an unverified registry runs as code on your infrastructure with whatever access you granted it.

Mitigation: Allowlist servers. No auto-install from unverified sources. Run with least privilege and network egress controls. Treat every server as untrusted code.
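The allowlist and egress rule can be sketched as a default-deny lookup. The server name, version, and host below are made-up examples, and in practice egress would be enforced at the network layer, not only in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApprovedServer:
    name: str
    version: str          # exact pinned version, no ranges
    allowed_hosts: tuple  # egress destinations this server may reach


# Hypothetical allowlist; the entry is an example only.
ALLOWLIST = {
    "issue-tracker": ApprovedServer(
        "issue-tracker", "1.4.2", ("api.tracker.example",)
    ),
}


def check_server(name: str, version: str) -> ApprovedServer:
    """Return the approved entry, or raise if the server is not allowlisted at this version."""
    entry = ALLOWLIST.get(name)
    if entry is None:
        raise PermissionError(f"server {name!r} is not on the allowlist")
    if entry.version != version:
        raise PermissionError(
            f"server {name!r} version {version!r} is not the pinned {entry.version!r}"
        )
    return entry


def egress_allowed(entry: ApprovedServer, host: str) -> bool:
    """Default-deny egress: only hosts listed for this server are permitted."""
    return host in entry.allowed_hosts
```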

T4

Token passthrough & credential theft

Broad, long-lived tokens passed straight through a server to downstream APIs become a high-value target on every connected server.

Mitigation: Issue scoped, short-lived credentials per server. Use the OAuth authorization model. Never pass a shared broad token through MCP.
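The properties that matter are that a credential is bound to one server, limited to named scopes, and expires quickly. The sketch below illustrates only those checks; it is not an OAuth implementation, and `ServerCredential`, `issue_credential`, and `authorize` are hypothetical names standing in for what an authorization server would enforce.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServerCredential:
    server: str
    scopes: frozenset
    expires_at: float
    token: str


def issue_credential(server: str, scopes: set, ttl_seconds: int = 300,
                     now: Optional[float] = None) -> ServerCredential:
    """Mint a credential bound to one server, with explicit scopes and a short lifetime."""
    issued = time.time() if now is None else now
    return ServerCredential(
        server=server,
        scopes=frozenset(scopes),
        expires_at=issued + ttl_seconds,
        token=secrets.token_urlsafe(32),
    )


def authorize(cred: ServerCredential, server: str, scope: str,
              now: Optional[float] = None) -> bool:
    """Allow a call only for the bound server, a granted scope, and an unexpired credential."""
    current = time.time() if now is None else now
    return (
        cred.server == server
        and scope in cred.scopes
        and current < cred.expires_at
    )
```

With this shape, a credential stolen from one server cannot be replayed against another, and it stops working within minutes.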

T5

Excessive agency / confused deputy

The agent holds many tools’ permissions; an injection through one tool makes it misuse another’s authority.

Mitigation: Default-deny tools. Allowlist per task. Human-in-the-loop on irreversible or cross-boundary actions. Isolate credentials per server.
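A default-deny policy with a per-task allowlist and a human-in-the-loop tier might look like the following sketch. The task and tool names are hypothetical, and the set of irreversible actions would come from your own risk review.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"


# Hypothetical per-task allowlists; anything not listed is denied.
TASK_TOOLS = {
    "triage_issues": {"read_issue", "add_label", "close_issue"},
}

# Hypothetical set of actions that cannot be undone or that cross a boundary.
IRREVERSIBLE = {"close_issue", "delete_branch"}


def decide(task: str, tool: str) -> Decision:
    """Default-deny: a tool is usable only if the current task allowlists it."""
    allowed = TASK_TOOLS.get(task, set())  # unknown task means an empty allowlist
    if tool not in allowed:
        return Decision.DENY
    if tool in IRREVERSIBLE:
        return Decision.NEEDS_APPROVAL
    return Decision.ALLOW
```

Because the allowlist is keyed by task, an injection arriving through one tool cannot reach tools the task was never granted.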

T6

Command & data injection in server implementations

MCP servers that build shell commands or queries from agent arguments inherit classic injection bugs.

Mitigation: Validate and parameterize all inputs in server code. Never build shell/SQL from raw model arguments. Apply standard appsec to servers.
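In server code this comes down to two habits: bind values through query placeholders, and pass commands as an argument list instead of a shell string. The sketch below uses Python's standard `sqlite3` module; the `users` table, the `git log` tool, and the function names are illustrative assumptions.

```python
import re
import sqlite3

# Conservative pattern for a git ref supplied by the model (an assumption;
# tighten or loosen it to match what your tool really needs).
SAFE_REF = re.compile(r"[A-Za-z0-9._/-]{1,100}")


def find_user(conn: sqlite3.Connection, username: str) -> list:
    """Look up a user with a bound parameter, never string concatenation."""
    cursor = conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (username,)
    )
    return cursor.fetchall()


def git_log_argv(ref: str) -> list:
    """Build an argument list for `git log`, rejecting unsafe refs."""
    # Reject anything outside the pattern, and anything that could be
    # parsed as an option because it starts with a dash.
    if not SAFE_REF.fullmatch(ref) or ref.startswith("-"):
        raise ValueError(f"unsafe ref: {ref!r}")
    # Pass this list to subprocess.run(argv, check=True) without shell=True,
    # so no shell ever interprets the model-supplied value.
    return ["git", "log", "--oneline", "-n", "10", ref, "--"]
```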

The authorization model, and its limits

Early MCP deployments were notorious for running with no authentication, or with one broad, long-lived token passed straight through a server to the downstream API. That passthrough pattern is the worst of both worlds: every connected server becomes a holder of a high-value credential, and a single compromised server hands an attacker the keys to the system behind it. Several disclosed MCP exposures trace to exactly this.

The protocol has since standardized on an OAuth-based authorization model, which is the right direction: scoped, revocable, per-client access instead of shared secrets. But authorization in the spec is not security in your deployment. A token that is technically OAuth but still broadly scoped and long-lived buys little. The work is scoping each server to the minimum it needs, keeping credentials short-lived, and isolating them so one server’s compromise does not become every server’s compromise.

FOR YOUR ROLE

What to do this quarter

For the technical CTO

Make MCP servers a reviewed dependency, not a self-service install. Maintain an allowlist with pinned versions, run each server with least privilege and egress controls, and require an architecture review before any agent gets a new tool. Default-deny tool permissions and require approval on sensitive calls.

For the business CAIO

Fold MCP into the AI risk register as a distinct surface: every connected server is a third party holding credentials and a path for untrusted content. Fund the credential-scoping and logging work before agents touch production data. See the AI risk management guide.

For the CISO

Treat MCP servers as you would any code with network access and credentials: inventory, vulnerability scanning, egress monitoring, and a kill switch. Add tool-call logging to the SIEM and build detection for anomalous tool chains. Establish an incident runbook for a compromised server.
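As a starting point for the logging and detection work, the sketch below emits one JSON record per tool call and flags one simple anomalous chain: a content-reading tool followed later by a tool that sends data out. The field names, redaction keys, and tool sets are assumptions, and a real SIEM pipeline would use its own schema.

```python
import hashlib
import json
import time
from typing import Optional

# Argument keys whose values should never reach the log (an assumed list).
REDACT_KEYS = {"token", "password", "api_key", "authorization"}


def tool_call_record(server: str, tool: str, arguments: dict, result: str,
                     now: Optional[float] = None) -> str:
    """Serialize one tool invocation as a JSON log line with secrets redacted."""
    safe_args = {
        key: "[REDACTED]" if key.lower() in REDACT_KEYS else value
        for key, value in arguments.items()
    }
    encoded = result.encode("utf-8")
    record = {
        "ts": time.time() if now is None else now,
        "server": server,
        "tool": tool,
        "arguments": safe_args,
        # Size and hash identify the result without copying its content here.
        "result_bytes": len(encoded),
        "result_sha256": hashlib.sha256(encoded).hexdigest(),
    }
    return json.dumps(record, sort_keys=True)


def suspicious_chain(tools: list, readers: set, senders: set) -> bool:
    """Flag a sequence in which a reader tool is later followed by a sender tool."""
    seen_reader = False
    for tool in tools:
        if tool in readers:
            seen_reader = True
        elif tool in senders and seen_reader:
            return True
    return False
```

This version logs a digest of the result instead of the result itself to keep sensitive data out of the log; if you need full results for forensics, store them separately under tighter access control.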

MCP Security: Frequently Asked Questions

What is MCP security?
MCP security is the practice of protecting systems that use the Model Context Protocol — the open standard for connecting AI agents to tools and data sources. It covers the risks that appear once an LLM can call external tools through MCP servers: prompt injection delivered through tool outputs, malicious or compromised MCP servers, tool-definition poisoning, over-broad permissions, and credential or token theft. MCP does not create new categories of LLM risk so much as it amplifies existing ones, because every connected server is a new path for untrusted content and a new holder of credentials.
Can MCP be used for prompt injection?
Yes, and it is the central MCP risk. Because an agent reads the outputs of the tools it calls, any MCP server — or any data that server returns — can carry attacker instructions the model then treats as input. This is indirect prompt injection with a wider surface: a poisoned document in a connected file store, a crafted issue in a connected tracker, or a malicious tool description can all inject instructions. The agent cannot reliably tell a legitimate tool result from an attack embedded inside one.
What is tool poisoning in MCP?
Tool poisoning is when a malicious MCP server embeds hidden instructions in its tool descriptions or metadata, the text the model reads to decide how and when to use a tool. Because the model ingests those descriptions as trusted context, a server can use them to manipulate the agent into exfiltrating data, calling other tools, or overriding the user’s intent. A related attack is the "rug pull," where a server presents benign tool definitions during review and silently changes them after it has been approved and connected.
How do I secure an MCP deployment?
Treat every MCP server as untrusted code with network access. Vet and pin servers (no auto-install from unverified registries); run them with least privilege and network egress controls; issue each server scoped, short-lived credentials rather than passing through broad tokens; require explicit human approval for sensitive tool calls; and log every tool invocation with its arguments and result. On the model side, treat all tool outputs as untrusted input and constrain which tools an agent may chain. The controls are the same defense-in-depth used for prompt injection, applied at the tool boundary.
Is MCP authentication secure by default?
Early MCP deployments frequently ran with no authentication or with broad, long-lived tokens passed straight through to downstream services — the source of several disclosed exposures. The protocol has since standardized on an OAuth-based authorization model, which is a meaningful improvement, but it is not automatic: a deployment is only as secure as how it scopes tokens, isolates servers, and constrains permissions. Authorization support in the spec reduces risk; it does not remove the need to design least-privilege access per server.
What is the "confused deputy" problem in MCP?
A confused deputy is a privileged component tricked into misusing its authority on behalf of an attacker. In MCP, the agent is the deputy: it holds credentials and can call powerful tools, so an attacker who injects instructions through one tool’s output can make the agent use its other tools’ permissions to act against the user. The defense is to constrain each tool’s authority, isolate credentials per server, and require confirmation before the agent crosses a privilege boundary.
Thomas Prommer Technology Executive — CTO/CIO/CTAIO
