Cookie Consent
Hi, this website uses essential cookies to ensure its proper operation and tracking cookies to understand how you interact with it. The latter will be set only after consent.
Read our Privacy Policy
Q4 2025 AI Agent Security Trends

Download Your Content

Get your copy of "Q4 2025 Agent Security Trends Report"

Overview

Explore AI security with the Lakera LLM Security Playbook. This guide is a valuable resource for everyone looking to understand the risks associated with AI technologies.

Ideal for professionals, security enthusiasts, or those curious about AI, the playbook offers insight into the challenges and solutions in AI security.

Highlights

  • Comprehensive Analysis of LLM Vulnerabilities: Detailed overview of critical security risks in LLM applications.
  • Gandalf - The AI Education Game: Introduction to Gandalf, an online game designed for learning about AI security.
  • Expansive Attack Database: Insights from a database of nearly 30 million LLM attack data points, updated regularly.
  • Lakera Guard - Security Solution: Information about Lakera Guard, developed to counteract common AI threats.
  • Practical Security Advice: Tips on data sanitization, PII detection, and keeping up-to-date with AI security developments.

Overview

The Q4 2025 Agent Security Trends Report breaks down how real-world attacks are already targeting early agentic AI systems. Based on a 30-day snapshot of production attack traffic observed by Lakera Guard, the report shows how attacker behavior is evolving as models gain capabilities like tool use, browsing, and structured context handling. Rather than speculative threats, this report focuses on what attackers are actually doing today: what they’re trying to steal, how they succeed, and which techniques show up again and again in practice.

The data reveals a clear shift toward more efficient and harder-to-detect attacks, especially indirect prompt injection delivered through external content that agents are designed to trust. Together, these trends highlight why securing agents requires going beyond output moderation and extending protections across the full agent workflow.

.

Highlights

Key signals from Q4 2025 attack data include:

  • System Prompt Leakage dominates. Nearly 60% of observed attacks attempted to extract system instructions, making configuration targeting the primary attacker objective.
  • Indirect attacks succeed faster. Indirect prompt injections required significantly fewer attempts than direct attacks, appearing across multiple attacker intents.
  • New agent-specific attack surfaces. Tool use, external data ingestion, and script-shaped content introduced entirely new ways to manipulate agent behavior.
  • Role play and obfuscation remain core techniques. Attackers consistently combined techniques like hypothetical scenarios, role play, and obfuscation to bypass safeguards.

Download the full report to explore the data, visuals, and deeper analysis behind these trends, and see what they mean for securing agentic systems heading into 2026.