Short answer
MCP security best practices are the habits and controls that keep AI agents from trusting the wrong server, reusing the wrong permissions, or obeying the wrong text. In a healthy MCP deployment, the model does not get to treat every server description, resource, prompt, or tool result as equally trustworthy just because it arrived through the protocol.
The cleanest way to think about MCP is in four layers: discovery and supply chain trust, identity and session binding, runtime isolation and permissions, and model-facing untrusted content. If one of those layers is weak, the others have to work much harder.
If you want the narrower attack explainer first, read our guide to MCP prompt injection and tool poisoning. If you want the broader prompt-injection controls that apply outside MCP too, read our risk-reduction guide. This page is the operational checklist for securing real MCP-enabled workflows.
Why MCP security became a major topic in 2025 and 2026
MCP moved fast from developer convenience to enterprise security concern. The official MCP docs now have dedicated security and authorization guidance. OWASP published both client and server MCP guides. Microsoft, Docker, and Palo Alto all published practical operational advice. The research has also moved beyond toy demos into measurement on real ecosystems.
That matters because the conversation is no longer theoretical. Recent work and official guidance all point in the same direction: speaking MCP does not make a server trustworthy, and using OAuth does not automatically make a workflow safe.
- MCP at First Glance analyzed 1,899 open-source MCP servers and found 7.2% with general vulnerabilities plus 5.5% with MCP-specific tool-poisoning findings.
- MCPTox built 1,312 malicious cases across 45 live MCP servers and 353 authentic tools, with the best reported attack success rate reaching 72.8% and refusal still below 3% in their evaluation.
- MCP-ITP showed that implicit tool poisoning can push attack success to 84.2% while driving malicious-tool detection as low as 0.3%.
- Give Them an Inch and They Take a Mile analyzed 6,137 real-world MCP servers and found insecure authorization behavior in 46.4% of them.
If you want the framework view that sits above these mechanics, read our OWASP Top 10 for LLM applications guide. MCP security touches several of those OWASP categories at once: prompt injection, supply chain risk, improper output handling, and excessive agency.
The four-layer MCP security model
The most useful mental model is to stop asking, "How do we secure one tool call?" Ask instead, "Where can trust go wrong before, during, and after the tool call?"
| Layer | What usually breaks | Best practices |
|---|---|---|
| Discovery and supply chain | Typosquatted or malicious servers, rug pulls, quiet metadata drift, weak review of third-party servers | Trusted registries, source verification, version pinning, checksums, staged rollout, approval logs |
| Identity and session | Token passthrough, shared auth state, caller confusion, overly broad scopes, weak redirect handling | OAuth 2.1, PKCE, exact redirect validation, short-lived scopes, per-session binding, no global auth reuse |
| Runtime and permissions | Overprivileged filesystem or network access, unscoped secrets, unsafe local HTTP exposure, runaway actions | Containers or equivalent sandboxing, read-only mounts, outbound limits, quotas, scoped secrets, least privilege |
| Model-facing content | Tool poisoning, prompt injection, hidden instructions in manifests, prompts, resources, or tool output | Treat metadata as untrusted, schema validation, task compartmentalization, confirmation gates, input inspection |
What the recent papers say
The papers do not all study the same failure, which is why they are useful together.
- Breaking the Protocol argues that some MCP weaknesses are architectural, not just bad implementations: capability claims, sampling paths, and trust propagation all deserve scrutiny.
- MCPTox shows that poisoning tool metadata works on real servers and authentic tools, not just synthetic examples.
- MCP-ITP matters because the poisoned tool itself may never run. The metadata can still steer a different legitimate high-privilege tool.
- Caller Identity Confusion is the strongest 2026 source for why OAuth alone is not enough if the server does not bind execution authority to the real caller and session.
- Instruction Hierarchy remains the cleanest conceptual bridge: outside content, tool descriptions, and third-party text should not outrank higher-trust instructions just because they are in the context window.
The safest summary is that MCP security is not one problem. It is a combined trust problem across software supply chain, authorization design, runtime containment, and model reasoning over untrusted text.
1. Use trusted discovery and server onboarding
The first mistake is to treat server discovery as a convenience problem instead of a security problem. If a team can connect any useful-looking MCP server from a public directory or repository without review, the attack surface starts before the first prompt is sent.
The strongest practical guidance from OWASP, Docker, Palo Alto, and the ecosystem papers points in the same direction: use a trusted registry or allowlist, record who approved a server, pin versions, and verify that the server you approved is still the server you are running. This is the best defense against typosquatting, malicious uploads, and rug-pull style drift after initial approval.
- Prefer registry-only discovery where possible.
- Verify source, maintainer, signatures, and dependency health before enablement.
- Pin versions and store hashes or checksums for the reviewed build and tool metadata.
- Roll out new or updated servers in staging before promoting them to production.
- Alert on drift when a manifest, description, or server version changes after approval.
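The pin-and-verify steps above can be sketched in a few lines. This is a minimal illustration, not a real registry client: the allowlist structure, the server name, and the drift messages are all assumptions, and a production version would also verify signatures and dependency health.

```python
import hashlib
import json

# Hypothetical allowlist of approved servers: a pinned version plus the
# SHA-256 of the manifest that was actually reviewed at approval time.
APPROVED_SERVERS = {
    "files-server": {
        "version": "1.4.2",
        "manifest_sha256": None,  # recorded when the review is signed off
    }
}

def manifest_digest(manifest: dict) -> str:
    """Stable hash of a server manifest (sorted keys for determinism)."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def check_server(name: str, version: str, manifest: dict) -> list[str]:
    """Return drift findings; an empty list means the running server
    still matches what was approved."""
    approved = APPROVED_SERVERS.get(name)
    if approved is None:
        return [f"server '{name}' is not on the allowlist"]
    findings = []
    if version != approved["version"]:
        findings.append(f"version drift: approved {approved['version']}, got {version}")
    if approved["manifest_sha256"] and manifest_digest(manifest) != approved["manifest_sha256"]:
        findings.append("manifest changed since approval (possible rug pull)")
    return findings
```

Running `check_server` at connect time, not just at onboarding, is what turns a one-time review into ongoing drift detection.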
For teams using third-party servers, this one step does more than it seems. It narrows the universe of things the model can trust before the model ever sees a tool description.
2. Bind authorization to the actual caller and session
The official MCP authorization guidance is a good starting point: use OAuth 2.1, treat local clients as public clients, use PKCE, validate redirects exactly, and avoid token passthrough. But the 2026 authorization research makes the more important point: good auth still fails if execution authority is cached and silently reused across callers.
In plain language, a server should not say, "Someone authorized me once, so any later tool invocation is fine." It should say, "This specific caller and this specific session are authorized for this specific scope." That is the difference between authentication existing somewhere in the code and authorization actually being bound to the call that matters.
- Use OAuth 2.1 and PKCE where the official model fits.
- Validate `iss`, `aud`, expiry, signatures, and redirect URIs precisely.
- Never pass through tokens that were not explicitly issued for the MCP server.
- Bind tokens, session state, and caller identity to the active connection, not global server memory.
- Re-authorize when the caller, client, or connection context changes.
If OAuth cannot be used in a specific deployment, the fallback should be narrower, not laxer: short-lived scoped credentials, explicit identity, and tighter approval gates.
3. Constrain runtime permissions and isolate execution
Even a well-reviewed server can still do too much if it runs with too much reach. That is why the official docs and vendor guidance keep returning to isolation, not just correctness. Sandboxing is what limits the damage when the model, the tool, or the approval workflow gets something wrong.
For local connections, prefer STDIO or Unix sockets over unnecessary local network exposure. If you must expose local HTTP, bind to `127.0.0.1`, keep explicit authentication in place, and validate origins. For third-party or higher-risk servers, containers are the most practical default because they reduce filesystem, secret, and network reach without changing the rest of the stack.
- Run third-party or sensitive MCP servers in containers or equivalent sandboxes.
- Use read-only mounts where write access is not needed.
- Limit outbound network access to the minimum required destinations.
- Inject secrets only into the workloads that need them, and keep them narrowly scoped.
- Apply per-session quotas, timeouts, and rate limits so one workflow cannot sprawl uncontrollably.
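Containers and read-only mounts are configuration, but the quota and timeout point can be shown in code. The sketch below is a per-session budget wrapper; all the limit values are illustrative assumptions, not anything from the MCP spec, and a real client would place this check in front of every tool dispatch.

```python
import time

class SessionBudget:
    """Per-session quotas so one workflow cannot sprawl: caps total
    tool calls, per-tool calls, and wall-clock session lifetime."""
    def __init__(self, max_calls=50, max_per_tool=20, max_seconds=300.0,
                 clock=time.monotonic):
        self.max_calls = max_calls
        self.max_per_tool = max_per_tool
        self.max_seconds = max_seconds
        self.clock = clock
        self.started = clock()
        self.calls = 0
        self.per_tool: dict[str, int] = {}

    def charge(self, tool: str) -> None:
        """Raise before the call executes if any budget is exhausted."""
        if self.clock() - self.started > self.max_seconds:
            raise TimeoutError("session lifetime exceeded")
        if self.calls >= self.max_calls:
            raise RuntimeError("session call quota exhausted")
        if self.per_tool.get(tool, 0) >= self.max_per_tool:
            raise RuntimeError(f"per-tool quota exhausted for {tool!r}")
        self.calls += 1
        self.per_tool[tool] = self.per_tool.get(tool, 0) + 1
```

Charging before execution, not after, is deliberate: a runaway loop gets stopped at the boundary instead of being noticed in the logs later.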
The point is not to eliminate trust problems. It is to keep a trust failure from turning into shell access, broad file reads, or high-value data exposure.
4. Treat tool metadata and external content as untrusted
This is the layer people mean when they say "prompt injection in MCP," but it needs to be stated carefully. The problem is broader than one bad response. Tool names, descriptions, prompts, resources, and linked content can all act like instructions if the model gives them too much authority.
The official MCP tools specification helps here: tool annotations and related metadata should be treated as untrusted unless they come from a server you trust. The research papers make the same point from the attacker's side. Tool poisoning works because descriptive text can hijack decision-making before the tool is even executed.
This is also why the best practical defenses are not purely model-side. You want manifest review, schema validation, suspicious-text inspection, task compartmentalization, and user-visible previews of sensitive tool inputs. If you need the general background first, start with what prompt injection is and how indirect prompt injection works.
- Review full manifests and descriptions before enablement instead of relying on shortened summaries.
- Use strict schemas for tool inputs and outputs.
- Treat server responses and external resources as untrusted content, not hidden policy.
- Reset or compartmentalize sessions when the task changes, rather than carrying long-lived context forward indefinitely.
- Inspect AI-bound text, files, URLs, and parser-visible metadata before handing them to the agent.
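Suspicious-text inspection of tool metadata can be sketched as a simple pattern scan. The patterns below are illustrative assumptions, far from a complete ruleset, but they match the shape of real tool-poisoning payloads: instruction-like phrasing, hidden tags, and secret-seeking references inside what should be purely descriptive text.

```python
import re

# Heuristic markers of instruction-like text inside descriptive metadata.
# Illustrative only; a real scanner needs a much richer ruleset and
# ideally a dedicated classifier alongside the regexes.
SUSPICIOUS_PATTERNS = [
    r"ignore (all|any|previous|prior) (instructions|rules)",
    r"do not (tell|show|mention) (the )?user",
    r"<(important|system|secret)>",
    r"before (using|calling) (any|this) tool",
    r"(api[_-]?key|\.env|credentials)",
]

def scan_tool_metadata(tool: dict) -> list[str]:
    """Flag instruction-like or secret-seeking text in a tool's name,
    description, and parameter descriptions: data, not hidden policy."""
    texts = [("name", tool.get("name", "")),
             ("description", tool.get("description", ""))]
    for pname, p in tool.get("parameters", {}).items():
        texts.append((f"parameter:{pname}", p.get("description", "")))
    findings = []
    for field, text in texts:
        for pattern in SUSPICIOUS_PATTERNS:
            if re.search(pattern, text, re.IGNORECASE):
                findings.append(f"{field} matches /{pattern}/")
    return findings
```

A scanner like this is a gate for human review, not an oracle: anything flagged goes back to manifest review rather than being silently dropped or silently allowed.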
5. Require approvals for sensitive actions
Human approval is not a concession that the system failed. It is a design choice that stops hidden instructions from instantly becoming actions. The safest operational guidance from OWASP, Microsoft, and the official MCP docs points to the same pattern: preview, confirm, then execute for anything consequential.
This matters most where the action crosses a trust boundary the user cares about: local file access, shell execution, email, browser actions, payments, identity systems, admin APIs, or any new resource the workflow has not touched before. Broad autonomy should be the exception, not the default.
- Keep approvals on for destructive, external, or high-impact actions.
- Preview tool inputs before sensitive execution.
- Require fresh approval when a workflow reaches a new class of data or capability.
- Prefer narrower tasks to open-ended mandates like "handle my inbox" or "do whatever is needed."
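The preview-confirm-execute pattern above reduces to a small gate. This sketch is a minimal illustration under assumptions: the sensitive-tool set is a placeholder policy, and `execute` and `confirm` are injected callbacks so the gate stays testable; in a real client, `confirm` would render a user-visible preview of the exact tool inputs.

```python
from typing import Callable

# Assumed policy: tool calls that cross a trust boundary the user cares about.
SENSITIVE_TOOLS = {"run_shell", "send_email", "delete_file", "make_payment"}

def gated_call(tool: str, args: dict,
               execute: Callable[[str, dict], str],
               confirm: Callable[[str], bool]) -> str:
    """Preview, confirm, then execute for anything consequential.
    Non-sensitive tools pass through without a prompt."""
    if tool in SENSITIVE_TOOLS:
        preview = f"{tool}({', '.join(f'{k}={v!r}' for k, v in args.items())})"
        if not confirm(preview):
            return "denied: user rejected the action"
    return execute(tool, args)
```

Previewing the exact arguments matters as much as the yes/no itself: a confirmation dialog that hides the recipient address or file path approves less than the user thinks it does.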
The more power an agent has, the more important this step becomes. Prompt injection is much worse when excessive agency is already built into the workflow.
6. Monitor, revalidate, and keep a kill switch
Secure onboarding is not enough if nothing watches what happens after deployment. The practical guides are unusually consistent here: log tool use, log approvals, log unusual reads and writes, and keep enough inventory state that you can tell what changed and who approved it.
This is where security turns from a design document into an operating model. If a server starts making unexpected outbound requests, reading far more files than usual, or changing metadata after approval, you want alerts and a way to disable it quickly.
- Monitor prompt and tool-call telemetry for unusual patterns.
- Keep a reviewed inventory of approved servers, versions, owners, and scopes.
- Re-scan or revalidate servers when versions or descriptions change.
- Promote from staging to production only after a probation period and rollback path exist.
- Maintain a kill switch for compromised or drifting servers.
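A kill switch and a volume alert fit in one small object. This is a sketch, not a monitoring product: the baseline threshold is an illustrative assumption, and real baselines would come from observed telemetry per server and per window.

```python
from collections import Counter

class ServerMonitor:
    """Minimal post-deployment watch on one MCP server: counts tool
    calls, flags unusual volume, and acts as a kill switch."""
    def __init__(self, name: str, baseline_calls_per_window: int = 100):
        self.name = name
        self.baseline = baseline_calls_per_window
        self.calls: Counter = Counter()
        self.enabled = True
        self.alerts: list[str] = []

    def record_call(self, tool: str) -> None:
        if not self.enabled:
            raise RuntimeError(f"server {self.name!r} is disabled (kill switch)")
        self.calls[tool] += 1
        if sum(self.calls.values()) > self.baseline:
            self.alerts.append(f"{self.name}: call volume above baseline")

    def kill(self, reason: str) -> None:
        """Disable the server immediately and record why."""
        self.enabled = False
        self.alerts.append(f"{self.name}: disabled ({reason})")
```

The useful property is that disablement happens at the client boundary: once `kill` fires, no further tool call reaches the compromised or drifting server while the investigation runs.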
Where Veridicus Scan fits
A guide like this should not pretend one product solves MCP security. It does not. But some controls do fit naturally into the MCP stack, and Veridicus Scan's MCP guardrail mode is strongest when it is framed as a local intake and review layer, not as a replacement for auth or sandboxing.
If a workflow feeds AI-bound URLs, extracted text, files, or tool-visible content into an agent, one useful control is to inspect that material before the model sees it. If a workflow needs runtime review, methods like Scope Tools, Guard Plan, and Gate Action fit on the oversight side of the stack. The pages on coverage, URL scanning, and report exports show the evidence and export side of that model.
That means Veridicus Scan fits best where MCP risk overlaps with prompt injection, tool poisoning, improper output handling, and excessive agency. It can help reduce what reaches the model and make risky flows easier to review. It does not replace caller-bound authorization, least privilege, or runtime isolation.
FAQ
What are MCP security best practices?
MCP security best practices are the controls that keep AI agents from trusting the wrong servers, reusing the wrong permissions, or obeying the wrong text. The main layers are trusted discovery, caller-bound authorization, runtime isolation, approval gates, and treating tool metadata and external content as untrusted.
Is OAuth enough for MCP security?
No. OAuth is important, but it does not automatically bind every tool call to the real caller or fix runtime overreach. MCP deployments still need session binding, least privilege, sandboxing, confirmation flows, and monitoring.
Should MCP servers run in containers?
Containers are one of the most practical defaults for third-party or higher-risk MCP servers because they reduce blast radius. They help with filesystem isolation, scoped secrets, outbound network control, and reproducible deployment. They still need policy and permission design around them.
How does prompt injection affect MCP servers?
Prompt injection affects MCP when tool descriptions, manifests, prompts, resources, or tool output are treated like instructions instead of data. In MCP that can change tool choice, expose secrets, or trigger high-privilege actions even before the poisoned tool is executed.