The Progressive Breach Model Behind the OWASP Top 10 for Agentic Applications

AI Security

5

min read

February 20, 2026

Steve Giguere

The 2025 OWASP Top 10 for LLM Applications is about how models are manipulated. The 2026 OWASP Top 10 for Agentic Applications is about what happens when that manipulation is given autonomy.

That distinction matters.

The LLM Top 10 focuses on prompts, outputs, and model behavior. It asks how an attacker can influence what a model says. The Agentic Top 10 moves one layer higher. It asks what happens when that model is allowed to act. When it has memory. When it calls tools. When it operates under real credentials. When it coordinates with other agents.

At that point, prompt injection is no longer a clever jailbreak. It becomes a way to redirect decisions. Data poisoning stops being a training concern and becomes a persistent memory problem. Excessive agency turns into a privilege escalation pathway.

Read casually, the Agentic Top 10 looks like another checklist. Ten risks. Ten mitigations. Another PDF.

Read carefully, it describes a progression:

Compromise the mind.
Convert autonomy into power.
Allow the system to propagate.
Lose containment.

The real value of the Agentic Top 10 is not in the individual entries. It is in how they connect. A small manipulation at the language layer can evolve into autonomous action, cross-agent spread, and eventually system-wide failure.

That progression is what makes agentic systems fundamentally different from traditional LLM applications.

And that is the story we are about to unpack.

-db1-

TL;DR

The OWASP Top 10 for Agentic Applications is not just a list of risks. It describes a progressive breach model driven by autonomy.
Traditional LLM risks like prompt injection and data poisoning evolve into goal hijacking, memory corruption, and unsafe autonomous actions once agents can act.
Agentic breaches typically follow a progression: compromised intent → operational power via tools and credentials → cross-agent propagation → cascading failures and loss of containment.
The real security shift is amplification. A small manipulation at the language layer can scale into system-wide impact when agents have memory, tools, and coordination.
Securing agentic systems therefore requires containment and blast-radius thinking, not just input filtering or model-level guardrails.

-db1-

Risk Evolution

From Model Manipulation to Autonomous Breach

The Agentic Top 10 does not replace the LLM Top 10. It extends it. The vulnerability remains, but the blast radius expands once autonomy is introduced.

LLM Risk	Agentic Evolution
Prompt Injection	Goal Hijack / Memory Poisoning
Excessive Agency	Tool Misuse / Identity & Privilege Abuse
Improper Output Handling	Unexpected Code Execution
Data & Model Poisoning	Persistent Memory Corruption
Misinformation	Human–Agent Trust Exploitation

Phase 1: Compromise the Mind

Every agentic breach begins the same way. The attacker does not start by breaking authentication or exploiting an API. They start by changing what the agent believes.

In traditional LLM systems, this shows up as prompt injection. The model is tricked into producing an unsafe output. That is serious, but it is usually bound to a single interaction.

In agentic systems, the same technique becomes far more consequential.

Agents ingest more than user prompts. They read documents, retrieve RAG content, parse tool outputs, process emails, and consume messages from other agents. All of that flows into the same natural-language layer that shapes intent and planning. There is no reliable separation between “data” and “instruction” at that layer.

A sentence buried in a PDF can influence the same reasoning process as a system prompt.

This is where OWASP places Agent Goal Hijack. The attacker does not need to override the agent’s core instructions. They only need to redirect how the agent interprets its objective. A poisoned document can subtly reweight priorities. A malicious calendar invite can alter task selection. A tool response can smuggle in new constraints. The agent still appears aligned. Its internal objective has shifted.

That shift becomes more durable when memory is involved.

Agents summarize conversations. They store embeddings. They retrieve prior outputs as context for future decisions. When attacker-controlled content enters that memory layer, it does not disappear after one run. It persists. It gets re-summarized. It influences new plans. OWASP calls this Memory and Context Poisoning, and it frequently becomes the mechanism that locks a hijacked goal in place.

At this stage, nothing looks compromised. The agent still follows its system prompt. It still uses approved tools. Logs show normal API calls. The only thing that changed is the agent’s internal model of what it is supposed to optimize for.

That is the first and most important transition in the Agentic Top 10.

Manipulate the language layer in an LLM, and you influence a response. Manipulate the belief layer in an agent, and you influence behavior.

The mind has been compromised. The system just does not know it yet.

**In a recent internal Lakera hackathon, we stress-tested real attack scenarios inside an OpenClaw-style agent ecosystem. The goal: explore how agentic threats actually manifest, beyond theory.

These three deep dives document what we found:

OpenClaw, Skills, and the “Lord of the Flies” Problem How ungoverned skill ecosystems create incentive misalignment, weak trust boundaries, and emergent security chaos.
Memory Poisoning: From Discord Chat to Reverse Shell How instruction drift and poisoned agent memory can escalate into full command execution.‍
The Agent Skill Ecosystem: When Extensions Become a Malware Delivery Channel How malicious skills introduce supply-chain risk and hidden execution paths inside agent frameworks. **

Phase 2: Convert Autonomy into Power

Compromising intent is only the first step. The real risk appears when that compromised intent is given execution rights.

This is the structural difference between LLM systems and agentic systems. An LLM generates text. An agent generates actions.

Once autonomy is in play, the manipulated belief from Phase One begins to drive real operations. Agents call APIs. They trigger workflows. They execute scripts. They move money. They provision infrastructure. They modify configurations. They do this under valid credentials and within approved integrations.

This is where two OWASP categories become central: Tool Misuse and Exploitation and Identity and Privilege Abuse.

In the LLM Top 10, “Excessive Agency” describes what happens when a model is allowed to act too freely. In agentic systems, that freedom becomes leverage. A hijacked goal now has access to systems that matter.

-db1-The shift is subtle but critical:

A poisoned instruction is no longer just unsafe text. It becomes an API call.
A skewed memory entry is no longer just bias. It becomes a workflow decision.
A hallucinated output is no longer just misinformation. It can become executed code.

In practice, this looks disturbingly normal:

A finance agent prepares a transfer through an approved payments API. The credentials are valid. The endpoint is correct. The destination account was “validated” earlier in memory.
A coding agent pulls a dependency and runs a build step as part of routine maintenance. The package resolves successfully. The backdoor executes inside the build pipeline.
A security automation agent aggregates logs across systems. The tool chain is legitimate. The destination endpoint was quietly influenced upstream.

-db1-

Nothing in these flows breaks authentication. Nothing bypasses a firewall. The agent is operating exactly as designed. The only variable that changed is the objective driving those actions.

OWASP classifies this as Identity and Privilege Abuse because the agent often operates as a delegated principal. It inherits access from users, service accounts, or other agents. When the intent layer is compromised, that inherited privilege becomes an amplification mechanism.

At this point, the breach has crossed a threshold. It is no longer about language manipulation. It is about operational authority.

The system has moved from compromised reasoning to compromised execution.

And once execution is automated, containment becomes harder with every additional workflow that trusts the output.

Phase 3: Allow the System to Propagate

Up to this point, the compromise can still be misunderstood as a single-agent failure. A poisoned memory. A misused tool. A delegated credential gone wrong.

Agentic systems rarely operate in isolation.

Modern deployments rely on networks of planners, executors, retrievers, reviewers, and domain-specific helpers. Agents pass tasks to one another. They exchange context. They register capabilities through discovery services. They trust responses from peers inside the system boundary.

That architecture is what enables scale. It is also what enables spread.

OWASP captures this under Insecure Inter-Agent Communication and Agentic Supply Chain Vulnerabilities.

When agents communicate, they often treat internal messages as trustworthy by default. A planning agent issues instructions. An execution agent carries them out. A helper agent advertises a capability and gets routed traffic. If identity, intent, and message integrity are not strongly bound together, a compromised agent can influence others without ever breaching them directly.

-db1-Propagation can take several forms:

A low-privilege agent relays a request that inherits higher privileges downstream.
A malicious tool descriptor or MCP endpoint advertises capabilities that cause multiple agents to route data through it.
A compromised update in a shared registry spreads across agents that dynamically load tools at runtime.
A poisoned memory entry becomes shared context across multiple workflows.

The original manipulation now has distribution.-db1-

In traditional systems, compromise often requires lateral movement through explicit exploitation. In agentic systems, lateral movement can happen through normal coordination. Agents are designed to pass work along. They are designed to trust structured outputs from peers. They are designed to reuse context.

That design goal becomes the propagation channel.

This is also where supply chain risk changes shape. In static software, a compromised dependency spreads when deployed. In agentic ecosystems, tools, prompts, and capabilities can be loaded dynamically at runtime. A poisoned component does not need a full redeploy to spread. It can be discovered and trusted on demand.

The breach is no longer contained to a single objective. It now influences multiple agents, multiple workflows, and potentially multiple domains.

The system is teaching itself the attacker’s assumptions.

And once that happens, containment becomes exponentially harder.

Framework Overview

OWASP Top 10 for Agentic Applications (2026)

A concise summary of the highest-impact threats facing autonomous AI systems, tools, and multi-agent coordination.

Threat	Description	Quick Example
ASI01: Goal Hijack	Manipulation of instructions to redirect an agent's objectives or decision pathways.	Malicious emails triggering silent data exfiltration via Copilot.
ASI02: Tool Misuse	Agents applying legitimate tools in unauthorized, unsafe, or unintended ways.	An email summarizer being tricked into deleting production records.
ASI03: Identity Abuse	Exploitation of dynamic trust and delegation to escalate access and bypass controls.	Low-privilege agents inheriting excessive rights from high-privilege managers.
ASI04: Supply Chain	Malicious artifacts, third-party agents, or tools dynamically loaded into the execution chain.	A backdoored MCP server secretly BCC'ing organization emails to attackers.
ASI05: Unexpected RCE	Conversion of natural language into adversarial code execution or container escapes.	"Vibe coding" agents executing unreviewed shell commands that wipe data.
ASI06: Memory Poisoning	Corruption of persistent context or retrievable knowledge to bias future reasoning.	Attacking RAG sources to implant false refund policies in a finance agent.
ASI07: Insecure Inter-Agent	Weak controls on exchanges between coordinating agents allowing message manipulation.	MITM attacks on unauthenticated message buses hijacking task coordination.
ASI08: Cascading Failures	Propagation of a single fault across autonomous agents causing system-wide harm.	Financial trading agents network-wide acting on a single poisoned risk limit.
ASI09: Trust Exploitation	Manipulation of humans through fluency, perceived authority, or "fake explainability."	An agent fabricating audit rationales to trick analysts into deleting a database.
ASI10: Rogue Agents	Malicious or compromised agents that deviate from their scope to sabotage operations.	A hijacked agent autonomously spawning replicas to consume cloud resources.

Phase 4: Lose Containment

Once compromised intent has been operationalized and allowed to propagate, the system enters its most dangerous phase.

Loss of containment does not happen in a single moment. It emerges when small, automated decisions begin to reinforce each other.

OWASP calls this Cascading Failures.

Cascading failures are not the initial vulnerability. They are what happens when compromised agents continue to plan, execute, delegate, and learn without interruption. One altered decision becomes many. One automated action triggers a chain of dependent workflows. One poisoned assumption spreads across domains.

At this stage, the system is no longer responding to an attacker. It is responding to its own corrupted state.

-db1-The warning signs are operational, not linguistic:

Identical actions fanning out across multiple services in seconds.
Agents repeating each other’s outputs as trusted input.
Privileged workflows executing at scale under valid credentials.
Cross-domain effects where a decision in one subsystem reshapes behavior in another.

-db1-

Each action appears justified in isolation. Together, they form a feedback loop.

A planning agent adjusts parameters based on skewed data. Execution agents follow the updated plan. Oversight agents see policy compliance and allow it through. Memory persists the outcome. Future plans treat the corrupted result as ground truth.

Nothing in this sequence requires malware. Nothing requires broken authentication. The system remains internally consistent. It is simply optimizing around the wrong objective.

This is the real insight of the Agentic Top 10.

The risks are not independent categories. They describe a progression. Compromise intent. Convert autonomy into power. Enable propagation. Lose containment.

By the time cascading failures appear, the original prompt injection or poisoned document is often irrelevant. The system is now driving its own amplification.

That is the shift from model manipulation to autonomous breach.

** How far can agentic workflows be pushed? In this research, we demonstrate a zero-click remote code execution chain that turns normal MCP integrations and AI coding assistants into a scalable attack vector, no user interaction required.

‍Zero-Click Remote Code Execution: Exploiting MCP & Agentic IDEs A silent Google Doc share triggers prompt injection, automatic payload retrieval, credential theft, and persistent reverse shell access, all through intended agentic IDE functionality.**

Why This Model Matters

The OWASP Agentic Top 10 is not simply a taxonomy of risks. It is a model for how autonomy changes the shape of failure.

In traditional LLM systems, manipulation often ends at output. A model produces something unsafe. A guardrail blocks it. A human reviews it. The blast radius is limited.

In agentic systems, the same manipulation becomes the first stage of execution. A poisoned instruction becomes a goal. A skewed memory entry becomes a planning input. A misaligned output becomes a workflow trigger. Each step builds on the previous one.

That progression is the insight.

The Agentic Top 10 matters because it forces teams to think in phases rather than categories. Not “Do we block prompt injection?” but “If intent is compromised, how far can it travel?” Not “Are our tools secured?” but “What happens when a compromised objective uses them?” Not “Do agents authenticate to each other?” but “How quickly can a bad decision replicate across the system?”

Seen through that lens, the Top 10 stops being a checklist and becomes a containment framework.

Compromise intent.
Convert autonomy into power.
Enable propagation.
Lose containment.

Once autonomy is introduced, security is no longer about filtering inputs. It is about limiting amplification.

That is the real shift the Agentic Top 10 is trying to capture.

The Lakera team has accelerated Dropbox’s GenAI journey.

Not sure how to secure your GenAI application?
Skip the guesswork with expert-recommended policies built by Lakera’s AI security team. Apply them in seconds, fine-tune when you’re ready, and get started with real protection from day one.

Download the Guide

On this page

Text Link

Hide table of contents

Show table of contents

The Progressive Breach Model Behind the OWASP Top 10 for Agentic Applications

TL;DR

Phase 1: Compromise the Mind

Phase 2: Convert Autonomy into Power

Phase 3: Allow the System to Propagate

Phase 4: Lose Containment

Why This Model Matters

Related Articles