AI Security · Model Context Protocol
MCP Security
Prompt Injection & Tool Risk in the Model Context Protocol
The Model Context Protocol gives an agent a standard way to call tools and reach data. It also gives an attacker a standard way in. MCP does not invent new LLM vulnerabilities — it amplifies the existing ones, because every connected server is a new path for untrusted content and a new holder of credentials. This guide covers the six MCP threat classes, the OAuth authorization model, and the least-privilege defense stack that keeps a useful agent from becoming a confused deputy.
30-SECOND EXECUTIVE TAKEAWAY
- Every MCP server is untrusted code with credentials. Vet it, pin it, run it least-privilege, and monitor its egress — the same way you would any third-party dependency with network access.
- Tool output is an injection vector. Whatever a server returns, the agent reads as input. Treat all tool results as untrusted and constrain what the agent can do with them.
- Authorization helps; it does not save you. The OAuth model is a real improvement over passthrough tokens, but security comes from scoping credentials per server and gating sensitive actions — not from the spec alone.
Why MCP needs its own threat model
The Model Context Protocol solved a real problem: before it, every agent-to-tool connection was a bespoke integration. A single open standard for exposing tools and data to a model made agents dramatically more capable. The cost of that capability is surface area. Each MCP server an agent connects to is a new place untrusted content can enter, a new component that holds credentials, and a new piece of code running with whatever access you granted it.
None of the individual risks are new. Prompt injection, excessive agency, and improper output handling are already on the OWASP LLM Top 10. What MCP changes is the multiplier: an agent with ten connected servers has ten injection paths, ten credential holders, and ten pieces of third-party code in its trust boundary. The threat model below is those existing risks, instantiated at the tool boundary.
THE SIX MCP THREAT CLASSES
What goes wrong, and how to contain it
Each threat pairs a plain-language description with the practical control. None is exotic; the danger is that MCP makes all six easy to introduce by simply connecting one more server.
Indirect prompt injection via tool output
Attacker instructions embedded in data an MCP server returns (a file, an issue, an email, a web page) are read by the agent as input.
Mitigation: Treat all tool outputs as untrusted. Constrain tool chaining. Require approval before acting on retrieved content. See the prompt injection guide.
Tool-definition poisoning
A malicious server hides instructions in tool descriptions/metadata the model reads to decide how to use a tool.
Mitigation: Vet server source. Review tool definitions. Pin versions so an approved server cannot silently redefine its tools (the "rug pull").
Malicious or compromised MCP server
A server installed from an unverified registry runs as code on your infrastructure with whatever access you granted it.
Mitigation: Allowlist servers. No auto-install from unverified sources. Run with least privilege and network egress controls. Treat as untrusted code.
Token passthrough & credential theft
Broad, long-lived tokens passed straight through a server to downstream APIs become a high-value target on every connected server.
Mitigation: Scope and short-live credentials per server. Use the OAuth authorization model. Never pass a shared broad token through MCP.
Excessive agency / confused deputy
The agent holds many tools’ permissions; an injection through one tool makes it misuse another’s authority.
Mitigation: Default-deny tools. Allowlist per task. Human-in-the-loop on irreversible or cross-boundary actions. Isolate credentials per server.
Command & data injection in server implementations
MCP servers that build shell commands or queries from agent arguments inherit classic injection bugs.
Mitigation: Validate and parameterize all inputs in server code. Never build shell/SQL from raw model arguments. Apply standard appsec to servers.
The authorization model, and its limits
Early MCP deployments were notorious for running with no authentication, or with one broad, long-lived token passed straight through a server to the downstream API. That passthrough pattern is the worst of both worlds: every connected server becomes a holder of a high-value credential, and a single compromised server hands an attacker the keys to the system behind it. Several disclosed MCP exposures trace to exactly this.
The protocol has since standardized on an OAuth-based authorization model, which is the right direction: scoped, revocable, per-client access instead of shared secrets. But authorization in the spec is not security in your deployment. A token that is technically OAuth but still broadly scoped and long-lived buys little. The work is scoping each server to the minimum it needs, keeping credentials short-lived, and isolating them so one server’s compromise does not become every server’s compromise.
FOR YOUR ROLE
What to do this quarter
For the technical CTO
Make MCP servers a reviewed dependency, not a self-service install. Maintain an allowlist with pinned versions, run each server least-privilege with egress controls, and require an architecture review before any agent gets a new tool. Default-deny tool permissions and require approval on sensitive calls.
For the business CAIO
Fold MCP into the AI risk register as a distinct surface: every connected server is a third party holding credentials and a path for untrusted content. Fund the credential-scoping and logging work before agents touch production data. See the AI risk management guide.
For the CISO
Treat MCP servers as you would any code with network access and credentials: inventory, vulnerability scanning, egress monitoring, and a kill switch. Add tool-call logging to the SIEM and build detection for anomalous tool chains. Establish an incident runbook for a compromised server.
MCP Security: Frequently Asked Questions
What is MCP security?
Can MCP be used for prompt injection?
What is tool poisoning in MCP?
How do I secure an MCP deployment?
Is MCP authentication secure by default?
What is the "confused deputy" problem in MCP?
Continue the AI security cluster
MCP widens the surface; the rest of the cluster covers how to defend it.