1
Input & Context

The user asks the agent to “Forget what you’ve been told…” As all text ( system prompt, context data and user query ) reaches the model as a single text block, the attacker is attempting to ask the model to forget its system prompt guardrails or restrictions to allow the model to explain something normally beyond its ethical boundaries.  In this case, an attempt to influence an upcoming election.

Policy snippet (copy/paste)
"data": {
"name": "Primary Policy",
"policy_mode": "IO",
"input_detectors": [
{
"type": "prompt_attack",
"threshold": "l1_confident"
},
{
"type": "moderated_content/hate",
"threshold": "l2_very_likely"
},
{
"type": "pii/address",
"threshold": "l2_very_likely"
},
2
Lakera Decision

Our Prompt Defense guardrails detect both the instruction override (“Forget what you’ve been told”) and the sensitive policy topic (elections).

Lakera can detect the malicious intent before the prompt reaches the model and logs the attempt for review.  This allows the application to respond both with appropriate message and monitor the user session for further malicious behaviour.

Log & audit fields
{
  "flagged": true,
  "timestamp": "2025-11-26T12:35:22Z",
  "breakdown": [
   {
      "project_id": "project-7539648934",
      "policy_id": "policy-a2412e48-42eb-4e39-b6d8-8591171d48f2",
      "detector_id": "detector-lakera-pinj-input",
      "detector_type": "prompt_attack",
      "detected": true,
      "message_id": 0
    },
   {
      "project_id": "project-7539648934",
      "policy_id": "policy-a2412e48-42eb-4e39-b6d8-8591171d48f2",
      "detector_id": "detector-lakera-moderation-21-input",
      "detector_type": "moderated_content/crime",
      "detected": true,
      "message_id": 0
    },

How Lakera stops
the attacks

Real time protection against prompt injections, data loss, and other emerging threats to your LLM applications.
Real-Time, Context-Aware Detection

Catch instruction overrides, jailbreaks, indirect injections, and obfuscated prompts as they happen, before they reach your model.

Enforcement You Control

Block, redact, or warn. Fine-tune with allow-lists and per-project policies to minimize false positives without weakening protection.

Precision & 
Adaptivity

Lakera Guard continuously learns from 100K+ new adversarial samples each day. Adaptive calibration keeps false positives exceptionally low.

Broad Coverage

Protects across 100+ languages and evolving multimodal patterns, with ongoing support for image and audio contexts.

Enterprise-Ready

Full audit logging, SIEM integrations, and flexible deployment options, SaaS or self-hosted, built for production-scale GenAI systems.

Works seamlessly with enterprise environments

Optimized for your infrastructure
Lakera provides seamless integrations 
for all your use cases
Integrate with existing analytics,
monitoring and security stack
Lakera works with Grafana, Splunk, 
and more
Enterprise-grade security
Built to meet highest standards 
including  SOC2, EU GDPR, and NIST

Frequently asked questions

How does Lakera Guard detect and stop prompt injection attacks?

Lakera Guard analyzes every input and output in real time to spot hidden or conflicting instructions that could override your model’s behavior. It flags or blocks prompt injections before they reach the model, protecting against both direct and indirect attacks.

Can Lakera Guard detect prompt attacks hidden in documents or links?

Yes. Guard scans fetched content, attachments, and URLs for embedded or indirect instructions, including those hidden in HTML, PDFs, or less common languages, to prevent indirect or link-based prompt injections.

How does Lakera stay ahead of new prompt attack techniques?

Lakera Guard continuously learns from real-world adversarial data, including over 100,000 new attacks analyzed daily through Gandalf, Lakera’s AI security game and research platform. This adaptive threat intelligence keeps your defenses up to date against emerging attack patterns.

Deploy AI with confidence
Get real time protection against prompt injections, data loss, and other emerging threats to your LLM applications in minutes.