AI Security with Lakera: Aligning with OWASP Top 10 for LLM Applications

Discover how Lakera's security solutions correspond with the OWASP Top 10 to protect Large Language Models, as we detail each vulnerability and Lakera's strategies to combat them.

David Haber
December 21, 2023


As organizations increasingly integrate Large Language Models (LLMs) into their operations, the Open Web Application Security Project (OWASP) Top 10 has become the go-to reference for understanding and mitigating the primary risks associated with these applications.

Lakera has been at the forefront, aligning our security solutions with these well-established risk areas, and ensuring comprehensive protection for the advanced AI systems that are transforming the business landscape.

Let’s dive into the main risks as identified by OWASP.

How Lakera covers OWASP Top 10 for LLM Applications

LLM01: Prompt Injection

At Lakera, we specialize in addressing both direct and indirect prompt injections and jailbreaks in text and visual formats.

Our API is designed to evaluate the likelihood of prompt injections, delivering immediate threat assessments for conversational AI applications. We also focus on visual prompt injections, understanding the unique challenges they pose.
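To make the idea of a likelihood score concrete, here is a minimal sketch of how an application might act on one. Everything in it is a hypothetical stand-in: the `detector` callable, the `toy_detector` keyword heuristic, and the 0.5 threshold are illustrative assumptions, not Lakera's API or detection logic.

```python
from typing import Callable

# Hypothetical phrases for illustration only; a real detector is a trained
# model, not a keyword list.
SUSPICIOUS_PHRASES = ("ignore previous instructions", "reveal your system prompt")


def toy_detector(prompt: str) -> float:
    """Stand-in scorer: 1.0 if a known phrase appears, else 0.0."""
    lowered = prompt.lower()
    return 1.0 if any(phrase in lowered for phrase in SUSPICIOUS_PHRASES) else 0.0


def screen_prompt(prompt: str, detector: Callable[[str], float],
                  threshold: float = 0.5) -> bool:
    """Return True if the prompt may be forwarded to the LLM."""
    return detector(prompt) < threshold
```

Keeping the detector behind a plain callable means the gate itself stays unchanged when the scoring logic is swapped for something stronger.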

Lakera's strength lies in our continuously growing database of over 30 million attacks and the vigilant monitoring of the threat landscape. Our Red Team actively stress-tests foundation models and our own product, and keeps an eye on publicly available jailbreaks, ensuring we're always updated on potential threats.

**💡 Pro Tip: Check out our Prompt Injection Attacks Handbook** 

Our internal Lakera Data Flywheel system is a crucial component of our strategy. It integrates data from Gandalf, production insights, and red team findings. These elements continuously fuel each other, enhancing our security systems to be as robust as possible.

We're committed to rapid updates and improvements to our detectors. We recognize that the threat landscape is ever-changing, and staying ahead of these developments is key to effective security at the enterprise level.

At Lakera, we view prompt injection as a gateway to other vulnerabilities, as listed by OWASP. 

For instance, consider an email plugin designed to summarize messages. If one of these emails contains a prompt injection, it could cause the app to send harmful emails to all contacts. This scenario illustrates how vulnerabilities like insecure plugin design and prompt injections can interconnect.

By constantly red teaming our own security systems, we ensure that we're always a few steps ahead of potential attackers.

LLM02: Insecure Output Handling

In the realm of LLMs like GPT-4, security concerns extend beyond just textual outputs. These advanced models are often integrated into systems where they communicate with other applications.

For example, insecurely handled JSON outputs can become a source of vulnerabilities, leading to potential risks in the systems they interact with.

Lakera’s focus is not just on ensuring that these outputs are valid JSON, with the right fields and correct content, but also on safeguarding the interactions between these outputs and other systems.
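As a rough illustration of the first half of that check (valid JSON, the right fields, the right types), the sketch below validates a model's output before any downstream system sees it. The schema in `EXPECTED_FIELDS` and the function name are hypothetical, chosen only for this example.

```python
import json

# Hypothetical schema for illustration: field name -> expected Python type.
EXPECTED_FIELDS = {"recipient": str, "subject": str, "priority": int}


def parse_llm_json(raw: str, expected: dict = EXPECTED_FIELDS) -> dict:
    """Parse model output and reject anything that deviates from the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("output must be a JSON object")
    if set(data) != set(expected):
        mismatch = sorted(set(data) ^ set(expected))
        raise ValueError(f"unexpected or missing fields: {mismatch}")
    for name, expected_type in expected.items():
        # bool is a subclass of int in Python, so reject it explicitly
        # (this sketch's schema has no boolean fields).
        if isinstance(data[name], bool) or not isinstance(data[name], expected_type):
            raise ValueError(f"field {name!r} must be {expected_type.__name__}")
    return data
```

Rejecting unknown fields, rather than silently ignoring them, is a deliberate choice here: an extra field is often the first sign that the output was steered by injected content.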

At Lakera, while we emphasize the security of conversational applications aimed at human interaction—where content moderation is key—our scope is broader.

We are also pioneering security for AI-powered applications that talk to each other.

This involves checking the outputs of LLMs for potential vulnerabilities that could lead to such risks as Remote Code Execution (RCE), before these systems interconnect.

Lakera Guard's capabilities are being developed to include thorough checks of LLM outputs, particularly in scenarios where these outputs facilitate interactions between different computer systems.

By proactively addressing these concerns, we aim to set a standard in LLM security, ensuring that as LLMs evolve in their capabilities and applications, their outputs remain secure, trustworthy, and free from vulnerabilities that could compromise entire systems.

LLM03: Training Data Poisoning

At Lakera, the challenge of training data poisoning is a critical focus, especially given its potential to significantly alter the behavior of LLMs.

It's alarming that a small number of poisoned examples can hijack or negatively influence an LLM. This risk escalates when models are fine-tuned, making them susceptible to skewing based on the data they are trained on.

We emphasize the importance of sourcing data from trusted places, especially when data for models is derived from public sources. This step is crucial to ensure the integrity and reliability of the training data.
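One simple way to act on this is to filter training records by where they came from. The sketch below is a minimal example under stated assumptions: each record is assumed to carry a `source_url` field, and the host names in `TRUSTED_HOSTS` are placeholders.

```python
from urllib.parse import urlparse

# Placeholder allowlist; a real one would be maintained by the data team.
TRUSTED_HOSTS = frozenset({"docs.example.com", "data.example.org"})


def from_trusted_source(record: dict, trusted=TRUSTED_HOSTS) -> bool:
    """Accept a record only if its source URL points at an allowlisted host."""
    host = urlparse(record.get("source_url", "")).hostname
    return host is not None and host in trusted


def filter_records(records: list, trusted=TRUSTED_HOSTS) -> list:
    """Drop records with a missing, malformed, or untrusted source."""
    return [r for r in records if from_trusted_source(r, trusted)]
```

An allowlist does not prove that the data is clean, only that it came from a place someone has vetted, so it complements behavioral evaluation rather than replacing it.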

The Role of Lakera Red:

  • Pre-Training Evaluation: Lakera Red meticulously evaluates data before it is used in training LLMs. This preemptive measure is vital to safeguard against the introduction of biased or harmful content.
  • Development of Protective Measures: Lakera Red specializes in identifying compromised systems from their behavior directly, enabling teams to assess whether their models have been attacked even after fine-tuning has occurred.

For example, one of the insidious effects of training data poisoning may be the introduction of bias towards certain populations or the inadvertent "forgetting" of crucial company policies. By evaluating model behaviors, Lakera aims to mitigate these risks, ensuring that LLMs are not only powerful but also responsible and unbiased in their functionalities.

Lakera Red capabilities

Lakera Red plays a pivotal role in this process, ensuring that every piece of training data upholds our high standards of quality and security.

**🛡️ Discover how Lakera’s Red Teaming solutions can safeguard your AI applications with automated security assessments that identify and address vulnerabilities effectively.**

LLM04: Model Denial of Service

Model Denial of Service attacks, akin to traditional Distributed Denial of Service (DDoS) attacks, are a significant concern in the realm of Large Language Models. 

These attacks occur when an LLM is bombarded with numerous complex requests, overwhelming its capacity to respond effectively and hindering its usability by other users. Such attacks are often distributed, originating from various locations worldwide, adding to the complexity of their detection and mitigation.

At Lakera, we monitor user activities to detect usage patterns that may indicate a Model DoS attack. This includes evaluating whether a flagged prompt is a genuine user request or part of an orchestrated attack.

Lakera Guard plays a crucial role in protecting LLMs from Model DoS attacks:

  • Blocking Suspicious Users: Lakera Guard has the capability to directly block users exhibiting suspicious behavior, thereby preventing the possibility of model misuse.
  • API Token Management: Users have the option to configure Lakera Guard to block specific API tokens, offering an additional layer of security and control over who can access the LLM.
  • Preventing System Overwhelm: By implementing these measures, Lakera Guard effectively mitigates the risk of the system being overwhelmed by malicious requests.
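The general pattern behind measures like these can be sketched as a request gate that combines a blocklist with a per-token sliding-window limit. This is an illustrative example only, not Lakera Guard's implementation; the class name, limits, and token values are assumptions.

```python
from collections import defaultdict, deque


class RequestGate:
    """Sliding-window request limiter with a manual blocklist (illustrative)."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.blocked_tokens = set()
        # token -> timestamps of requests accepted within the current window
        self._history = defaultdict(deque)

    def block(self, token: str) -> None:
        """Permanently refuse requests made with this API token."""
        self.blocked_tokens.add(token)

    def allow(self, token: str, now: float) -> bool:
        """Return True if a request made at time `now` should be served."""
        if token in self.blocked_tokens:
            return False
        history = self._history[token]
        # Drop timestamps that have fallen out of the window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_requests:
            return False
        history.append(now)
        return True
```

In a running service `now` would typically come from `time.monotonic()`; it is passed in explicitly here so the behavior is easy to test.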

Through vigilant monitoring and the ability to take decisive action against potential threats, Lakera ensures that LLM systems remain secure and resilient against these types of attacks.

LLM05: Supply Chain Vulnerabilities

In the case of supply chain vulnerabilities, the focus shifts to assessing the model's behavior to determine its functionality and potential risks.

Lakera Red plays a crucial role in scrutinizing various components of LLMs to identify and mitigate supply chain vulnerabilities.

The team examines not just the Python code but also the model weights, data, plugins, and open-source software used in the model's development.

Evaluating the model itself is often the most challenging task. For instance, detecting subtle biases, such as a model exhibiting slightly racist tendencies, requires a nuanced approach.

Lakera Red employs specific prompts designed to test various aspects of a model's behavior. These prompts are crafted to evaluate how well the model aligns with the policies and principles it was built upon.

Lakera Red enables application developers to check their systems for behaviors that are misaligned with their intent, measured against criteria such as helpfulness, truthfulness, and toxicity. Consistency with these criteria is crucial to ensure the model's reliability and safety.

Lakera’s approach involves in-depth analysis and testing, ensuring that the models not only perform as intended but also adhere to the ethical and operational standards set during their development.


LLM06: Sensitive Information Disclosure (PII)

Lakera Guard's precise detection and protection capabilities are essential for safeguarding Personally Identifiable Information (PII), ensuring both privacy and compliance with regulatory standards.

Lakera Guard goes beyond just detecting PII.

It plays a crucial role in preventing data loss and leakage. This includes safeguarding against prompt leakage, where sensitive data might inadvertently be included in LLM prompts.

A key feature of Lakera Guard is its ability to enforce access controls. This means ensuring that users only access specific documentation and that LLMs do not serve information from resources that users are not authorized to access.

This level of control is vital for maintaining data integrity and preventing unauthorized data exposure.
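The underlying idea can be sketched as a filter applied to retrieved documents before they ever reach the prompt. The `Document` type and group names below are hypothetical, and this is a generic illustration rather than a description of how Lakera Guard enforces access.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset  # groups permitted to read this document


def authorized_context(docs, user_groups):
    """Keep only documents that share at least one group with the user."""
    user_groups = frozenset(user_groups)
    return [d for d in docs if d.allowed_groups & user_groups]
```

Filtering before prompt construction matters: once restricted text is in the context window, no instruction to the model can reliably keep it out of the answer.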

With robust access control mechanisms, Lakera Guard provides an advanced layer of security.

In addition to all the above, we have developed the Lakera Chrome Extension which serves as a guard against inadvertent sharing of sensitive information. It analyzes inputs to ChatGPT and warns users about personal data like credit card numbers and email addresses.
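For a sense of what detecting these two kinds of data can look like, here is a minimal sketch using regular expressions plus a Luhn checksum to cut down on false card matches. It is an assumption-laden illustration, not the extension's actual logic; the patterns are deliberately simple and will miss many real-world formats.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pii(text: str) -> dict:
    """Return email addresses and Luhn-valid card numbers found in text."""
    emails = EMAIL_RE.findall(text)
    cards = []
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            cards.append(digits)
    return {"emails": emails, "cards": cards}
```

A caller could warn the user, or redact the matches, whenever either list is non-empty.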

LLM07: Insecure Plugin Design

Insecure plugin design represents a critical intersection of various AI security issues, such as prompt injection and excessive agency.

This issue becomes particularly prominent in plugins that interact with AI models, where poor design choices can lead to significant vulnerabilities.

Consider a plugin designed to send review emails. If a user asks it to summarize an email, there's a risk that a prompt injected into the email could hijack the plugin.

Often, insecure plugins stem from granting them inappropriate levels of access. For instance, a plugin that needs to read emails but is also given the capability to send emails demonstrates excessive access, leading to potential misuse.

Lakera employs red teaming strategies to assess plugins, focusing on their permission settings and potential vulnerabilities.

  • Balancing Access and Exposure: A key aspect of Lakera’s strategy is determining the 'right level of exposure' for a plugin. This means ensuring that the plugin has enough access to function effectively without exposing the system to unnecessary risks.
  • Systematic Evaluation of Permissions: Lakera’s approach involves a systematic evaluation of a plugin's permissions to confirm that they align with its intended functionality. This helps in preventing scenarios where a plugin could inadvertently expose sensitive data or become a vector for attacks.
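A permission review of this kind can be reduced, in its simplest form, to comparing what a plugin was granted with what it needs. The sketch below assumes permissions are plain strings such as `email.read`; those names are hypothetical.

```python
def audit_permissions(granted: set, required: set) -> dict:
    """Report permissions beyond what is needed, and needed ones not granted."""
    return {
        "excess": sorted(granted - required),
        "missing": sorted(required - granted),
    }
```

For the email example above, a summarization plugin that only needs to read mail but was also allowed to send it would be flagged:

```python
report = audit_permissions(
    granted={"email.read", "email.send"},
    required={"email.read"},
)
# report["excess"] lists "email.send"; report["missing"] is empty
```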

Insecure plugin design is a multifaceted challenge in the AI security landscape, requiring a nuanced approach to balance functionality and safety.

LLM08: Excessive Agency

The concept of “excessive agency” in AI system design refers to the risks associated with granting too much autonomy or power to AI models.

When an AI model is given overly broad access or control, predicting and managing potential issues becomes increasingly challenging. It's crucial to strike the right balance in system design to maintain oversight and control.

Lakera Red employs red teaming strategies, specifically tailored to AI systems, to address these concerns.

It conducts comprehensive scans of AI systems to identify and flag any vulnerabilities that might arise from excessive agency. This involves assessing the extent of access and control granted to the AI models.

Part of this process includes scrutinizing the system design to ensure that the AI model's power and access are appropriately limited. This evaluation helps maintain safe and predictable system behavior.
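One common way to limit a model's power at the design level is to classify every action it can request and deny anything unclassified. The sketch below illustrates that pattern; the action names and the three-way split are assumptions made for the example.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"


# Hypothetical action names for illustration.
AUTONOMOUS_ACTIONS = frozenset({"search_docs", "read_calendar"})
APPROVAL_ACTIONS = frozenset({"send_email", "delete_file"})


def decide(action: str) -> Decision:
    """Deny by default; only listed actions run, some only after human sign-off."""
    if action in AUTONOMOUS_ACTIONS:
        return Decision.ALLOW
    if action in APPROVAL_ACTIONS:
        return Decision.NEEDS_APPROVAL
    return Decision.DENY
```

The deny-by-default branch is the important part: a new capability stays unavailable to the model until someone has consciously placed it in one of the two lists.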

Through careful vulnerability assessments, Lakera ensures that AI systems are both effective and secure, with the right balance of autonomy and oversight.

LLM09: Overreliance

The risks associated with excessive dependence on LLMs in various applications extend beyond their immediate functionalities.

Particularly in security realms, the danger lies not only in relying too heavily on these AI-driven systems but also in over-trusting the protective measures in place.

Such complacency can result in overlooked vulnerabilities and a misguided sense of safety. To counter this, it's essential to adopt a proactive stance towards security—one that involves continuous monitoring and validation of both the LLMs and their defensive mechanisms.

This approach aligns with Lakera's strategy of "making security visible," emphasizing the critical need to "trust, but verify" in the realm of AI and LLM applications.

To combat overreliance, Lakera has developed a range of tools and features:

  • Flagging Vulnerabilities: Lakera's systems are designed to continuously monitor and flag any potential vulnerabilities. This proactive approach helps in identifying and addressing security issues before they escalate.
  • Security Alarms and Dashboards: Lakera provides intuitive dashboards that offer real-time insights into the security status of AI systems. These dashboards are coupled with security alarms that alert users to any unusual or potentially harmful activities.
  • Regular Security Alerts: In addition to real-time monitoring, Lakera sends out regular security alerts. These alerts keep users informed about the latest security findings and potential risks, ensuring that they are always aware of the current state of their defenses.

By providing these tools, Lakera encourages users to avoid complacency. The dashboards and alerts serve as constant reminders that while AI systems are powerful, they still require human oversight and governance.

Lakera’s approach to addressing overreliance in AI security systems emphasizes the importance of visibility and vigilance.

By offering tools like security dashboards and alerts, Lakera enables users to maintain an active role in overseeing their AI systems.

LLM10: Model Theft

In the AI industry, the risk of model theft is a significant concern, highlighting the need for robust cybersecurity practices.

Lakera addresses this issue by blending traditional cybersecurity measures with advanced system checks, thorough audits, and comprehensive customer education.

An example of a critical vulnerability is storing unreleased model weights in a public S3 bucket. Such practices greatly increase the risk of model theft.

To counter such mishaps, Lakera conducts regular, thorough audits of storage and data handling practices. These audits are designed to identify potential vulnerabilities, ensuring that any gaps in security are promptly addressed.
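As a small example of what one such storage check might look like, the sketch below scans an S3 bucket policy document (already loaded as a Python dict) for `Allow` statements whose principal is the wildcard `*`. It is a simplified illustration: it ignores `Condition` blocks, ACLs, and account-level public access settings, so it is a starting point rather than a complete audit.

```python
def public_allow_statements(policy: dict) -> list:
    """Return Allow statements in a bucket policy that apply to any principal."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear unwrapped
        statements = [statements]
    findings = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        if principal == "*":
            findings.append(stmt)
        elif isinstance(principal, dict):
            aws = principal.get("AWS")
            values = aws if isinstance(aws, list) else [aws]
            if "*" in values:
                findings.append(stmt)
    return findings
```

Any non-empty result for a bucket that holds model weights deserves a closer look.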

Lakera places a strong emphasis on educating our customers about the importance of cybersecurity in AI model development and maintenance. We provide resources and training to help customers understand and implement best practices.

By prioritizing secure storage, conducting regular audits, performing advanced system checks, and educating both customers and internal teams, Lakera ensures the protection of AI models against theft and unauthorized access.

What’s Next?

In a world where AI's potential is matched only by the complexity of its threats, Lakera stands ready to support your company's quest for security.

Reach out to us, and together we'll ensure your LLM applications are not only powerful and intelligent but also safe, secure, and trusted!

You can get started for free with Lakera Guard or book a demo here.
