Back

AI Security by Design: Lakera’s Alignment with MITRE ATLAS

Developed with MITRE ATLAS in mind, Lakera acts as a robust LLM gateaway, addressing vulnerabilities in data, models, and on the user front, protecting your AI applications against the most prominent LLM threats.

Lakera Team
December 1, 2023
Learn how to protect against the most common LLM vulnerabilities

Download this guide to delve into the most common LLM security risks and ways to mitigate them.

In-context learning

As users increasingly rely on Large Language Models (LLMs) to accomplish their daily tasks, their concerns about the potential leakage of private data by these models have surged.

[Provide the input text here]

[Provide the input text here]

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse varius enim in eros elementum tristique. Duis cursus, mi quis viverra ornare, eros dolor interdum nulla, ut commodo diam libero vitae erat. Aenean faucibus nibh et justo cursus id rutrum lorem imperdiet. Nunc ut sem vitae risus tristique posuere.

Lorem ipsum dolor sit amet, Q: I had 10 cookies. I ate 2 of them, and then I gave 5 of them to my friend. My grandma gave me another 2boxes of cookies, with 2 cookies inside each box. How many cookies do I have now?

Title italic

A: At the beginning there was 10 cookies, then 2 of them were eaten, so 8 cookies were left. Then 5 cookieswere given toa friend, so 3 cookies were left. 3 cookies + 2 boxes of 2 cookies (4 cookies) = 7 cookies. Youhave 7 cookies.

English to French Translation:

Q: A bartender had 20 pints. One customer has broken one pint, another has broken 5 pints. A bartender boughtthree boxes, 4 pints in each. How many pints does bartender have now?

Lorem ipsum dolor sit amet, line first
line second
line third

Lorem ipsum dolor sit amet, Q: I had 10 cookies. I ate 2 of them, and then I gave 5 of them to my friend. My grandma gave me another 2boxes of cookies, with 2 cookies inside each box. How many cookies do I have now?

Title italic Title italicTitle italicTitle italicTitle italicTitle italicTitle italic

A: At the beginning there was 10 cookies, then 2 of them were eaten, so 8 cookies were left. Then 5 cookieswere given toa friend, so 3 cookies were left. 3 cookies + 2 boxes of 2 cookies (4 cookies) = 7 cookies. Youhave 7 cookies.

English to French Translation:

Q: A bartender had 20 pints. One customer has broken one pint, another has broken 5 pints. A bartender boughtthree boxes, 4 pints in each. How many pints does bartender have now?

Hide table of contents
Show table of contents

The rapid adoption of GenAI across industries has surfaced complex security concerns. Some of these challenges are so novel that even seasoned AI practitioners are uncertain about their full implications.

We know first-hand that it’s not easy to keep pace with emerging AI threats. 

Fortunately, collaborative efforts among AI security researchers, cybersecurity organizations, and leading AI security companies like Lakera, make it possible to establish a structured approach for understanding and addressing the most critical security threats. 

Security frameworks such as OWASP Top 10 for LLM Applications and MITRE's ATLAS have become invaluable resources in the design and development of our own security solutions, aligning with the "secure-by-design" principle we advocate for.

In this article, we explore how Lakera proactively mitigates significant risks associated with adversarial AI, as identified by the ATLAS framework.

Contents:

How Lakera Red and Lakera Guard cover MITRE ATLAS

**💡 Pro tip: To learn how Lakera covers OWASP Top 10 for LLM read Aligning with OWASP Top 10 for LLM Applications.**

However—

Before we dive in, let's first explore what MITRE is and what makes up the MITRE ATLAS framework. 

What is MITRE

In the cybersecurity world, MITRE is a name that requires no introduction, renowned as one of the industry's most prominent organizations.

For those who may not be acquainted with it—MITRE is a not-for-profit organization, backed by the US government, developing standards and tools for addressing industry-wide cyberdefense challenges.

Over the years, MITRE has developed various frameworks, most notably:

At present, one of MITRE's objectives lies in educating the broader cybersecurity community on how to navigate the landscape of threats to machine learning systems. This has led to the development of MITRE ATLAS.

MITRE ATLAS Overview

As stated on MITRE’s website, MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) is a globally accessible, living knowledge base of adversary tactics and techniques based on real-world attack observations and realistic demonstrations from AI red teams and security groups. 

The framework was initially released in June 2021 with the mission of raising awareness of the unique and evolving AI vulnerabilities, as organizations began to integrate AI into their systems.

To learn more, check out this video introduction.

The MITRE ATLAS framework is modeled after MITRE ATT&CK.

Take a look at an overview of attack tactics and the corresponding attack techniques. The column headers are arranged to illustrate the progression of attack tactics from left to right.

Let’s have a look at them in more detail.

1. Reconnaissance 

The adversary is trying to gather information about the machine learning system they can use to plan future operations. Techniques include:

  • Search for Victim’s Publicly Available Research Materials
  • Search for Publicly Available Adversarial Vulnerability Analysis
  • Search Victim-Owned Websites
  • Search Application Repositories
  • Active Scanning

2. Resource Development 

The adversary is trying to establish resources they can use to support operations. Techniques include:

  • Acquire Public ML Artifacts 
  • Obtain Capabilities
  • Develop Capabilities
  • Acquire Infrastructure
  • Publish Poisoned Datasets
  • Poison Training Data
  • Establish Accounts

3. Initial Access

The adversary is trying to gain access to the machine learning system. Techniques include:

  • ML Supply Chain Compromise
  • Valid Accounts
  • Evade ML Model
  • Exploit Public Facing Application
  • LLM Prompt Injection
  • Phishing

4. ML Model Access

The adversary is attempting to gain some level of access to a machine learning model.  Techniques include:

  • ML Model Inference API access
  • ML-Enabled Product or Service
  • Physical Environment Access
  • Full ML Model Access

5. Execution

The adversary is trying to run malicious code embedded in machine learning artifacts or software. Techniques include:

  • User Execution
  • Command and Scripting Interpreter
  • LLM Plugin Compromise

6. Persistence

The adversary is trying to maintain their foothold via machine learning artifacts or software. Techniques include:

  • Poison Training data
  • Backdoor ML Model
  • LLM Prompt Injection

7. Privilege Escalation

The adversary is trying to gain higher-level permissions. Techniques include:

  • LLM Prompt Injection
  • LLM Plugin Compromise
  • LLM Jailbreak

8. Defense Evasion

The adversary is trying to avoid being detected by machine learning-enabled security software. Techniques include:

  • Evade ML Model
  • LLM Prompt Injection
  • LLM Jailbreak

9. Credential Access

The adversary is trying to steal account names and passwords. Techniques include:

  • Unsecured Credentials

10. Discovery

The adversary is trying to figure out your machine learning environment. Techniques include:

  • Discover ML Model Ontology
  • Discover ML Model Family
  • Discover ML Artifacts
  • LLM Meta Prompt Extraction

11. Collection

The adversary is trying to gather machine learning artifacts and other related information relevant to their goal. Techniques include:

  • ML Artifact Collection
  • Data From Information Repositories
  • Data from Local System

12. ML Attack Staging

The adversary is leveraging their knowledge of and access to the target system to tailor the attack. Techniques include:

  • Create Proxy ML Model
  • Backdoor ML Model
  • Verify Attack
  • Craft Adversarial Data

13. Exfiltration

The adversary is trying to steal machine learning artifacts or other information about the machine learning system. Techniques include:

  • Exfiltration via ML Inference API
  • Exfiltration via Cyber Means
  • LLM Meta Prompt Extraction
  • LLM Data Leakage

14. Impact

The adversary is trying to manipulate, interrupt, erode confidence in, or destroy your machine learning systems and data. Techniques include:

  • Evade ML Model
  • Denial of ML Service
  • Spamming ML System with Chaff Data
  • Erode ML Model Integrity
  • Cost Harvesting
  • External Harms

Each attack technique has its own dedicated page, offering in-depth explanations and case studies that exemplify real-world and academic instances of the discovered techniques.

Finally, let’s explore how Lakera is addressing the adversarial AI risks pinpointed by the ATLAS framework through Lakera Guard and Lakera Red.

Lakera’s Alignment with MITRE ATLAS

Developed with MITRE ATLAS in mind, Lakera Guard, when integrated with Lakera Red, acts as a robust LLM gateaway, addressing vulnerabilities in data, models, but also on the user front, such as in access control systems.

As you can see on the graphic below, we highlighted which of Lakera’s solutions—Lakera Guard and Lakera Red—align with MITRE ATLAS.

Here's a brief overview of Lakera Guard and Lakera Red's capabilities and how they cover AI risks outlined by MITRE.

Lakera Guard 

Relevant for: All 14 MITRE ATLAS tactics.

Lakera Guard is purpose built to monitor, detect, and respond to adversarial attacks on ML models and AI applications, specifically those powered by Large Language Models. Lakera Guard is model-agnostic. You can use it with any model provider (e.g. OpenAI, Anthropic, Cohere), any open-source model, or your custom model.

Lakera Guard is built on top of our continuously evolving security intelligence that empowers developers with industry-leading vulnerability insights. Our proprietary Lakera Data Flywheel system is instrumental in ensuring robust protection for AI applications under the guard of Lakera.

Lakera's threat intelligence database comprises over 30 million attack data points and expands daily by more than 100,000 entries.

Similarly to OWASP, MITRE ATLAS lists prompt injection as the initial access vector for adversaries, setting the stage for further malicious activities. These attacks are used to manipulate LLMs into performing unintended actions or ignoring their original instructions. This vulnerability can trigger a series of LLM-related threats, potentially leading to severe consequences like sensitive data leakage, unauthorized access, and overall security compromise of the application. Prompt injections are used to perform jailbreaks, phishing, or system prompt extraction attacks, which MITRE ATLAS identifies as other techniques that undermine AI application security.

Lakera Guard comes equipped with a set of detectors and powerful capabilities safeguarding LLM applications against threats such as:

  • Prompt injection attacks
  • Phishing
  • PII and data loss
  • Insecure LLM plugins design
  • Model denial of service attacks
  • LLM excessive agency (e.g. access control)
  • Supply chain vulnerabilities
  • Insecure LLM output handling
  • Hallucinations
  • Toxic language output

Here's an overview of Lakera Guard's role within an organization's security infrastructure.

Lakera Guard overview

The way Lakera Guard works is simple—our API evaluates the likelihood of a prompt injection, providing a categorical response and confidence score for real-time threat assessment.

It also supports multi-language detection, and currently provides the most advanced prompt injection detection and defense capabilities on the market. To learn more check out Lakera Guard Prompt Injection Defense.

Take a look at Lakera Guard dashboards that provide context to better understand detected threats and help determine the most appropriate response to protect against them.

MITRE Atlas example case studies that Lakera Guard addresses:

Finally, here's a preview of Lakera Guard in action.

Try Lakera Guard playground to test your prompts.

Lakera Red

Lakera Red is an enterprise-grade AI security product, designed to help organizations identify and address LLM security vulnerabilities before deploying their AI applications to production.

Its capabilities encompass the identification of poisoned training datasets, monitoring both ML model inputs and outputs, detecting vulnerabilities in LLM-powered applications, and assessing the operational risks to which your GenAI applications may be exposed. With data poisoning and ML supply chain vulnerabilities listed by MITRE ATLAS as significant adversarial techniques threatening AI application security, Lakera Red is designed to effectively counter these challenges.

Mitigating AI Risks with Lakera Red:

  • Pre-Training Data Evaluation: Lakera Red evaluates data before it is used in training LLMs. This proactive approach is crucial in preventing the introduction of biased or harmful content into the models, ensuring the integrity and reliability of the training process.
  • Development of Protective Measures: Lakera Red specializes in identifying compromised systems from their behavior directly, enabling teams to assess whether their models have been attacked even after fine tuning has been performed.
  • Access Control Assessment: In its comprehensive scans of AI systems, Lakera Red scrutinizes for vulnerabilities that might stem from excessive agency. It assesses the levels of access and control allocated to AI models, flagging any potential security risks. This process ensures that the AI systems operate within safe and controlled parameters, reducing the risk of unauthorized use or manipulation.
  • Continuous Red-Teaming: Lakera Red offers a continuous, automated stress-testing for AI applications. This is designed to proactively uncover security vulnerabilities both before and after deployment. By simulating real-world attacks and probing for weaknesses, it ensures that AI systems are robust and secure against evolving threats.
Lakera Red capabilities

No matter if you are building customer support chatbots, talk-to-your data internal Q&A systems, content or code generation tools, LLM plugins, or other LLM applications, Lakera Red will ensure they can be deployed securely.

Let's talk about AI security

Ready to start protecting your AI applications? Get in touch with us to talk about AI security tailored to your use case.

You can get started for free with Lakera Guard or book a demo here.

Lakera LLM Security Playbook
Learn how to protect against the most common LLM vulnerabilities

Download this guide to delve into the most common LLM security risks and ways to mitigate them.

Lakera Team
Read LLM Security Playbook

Learn about the most common LLM threats and how to prevent them.

Download
You might be interested
6
min read
AI Security

The Beginner's Guide to Visual Prompt Injections: Invisibility Cloaks, Cannibalistic Adverts, and Robot Women

What is a visual prompt injection attack and how to recognize it? Read this short guide and check out our real-life examples of visual prompt injections attacks performed during Lakera's Hackathon.
Daniel Timbrell
December 1, 2023
min read
AI Security

LLM Vulnerability Series: Direct Prompt Injections and Jailbreaks

of prompt injections that are currently in discussion. What are the specific ways that attackers can use prompt injection attacks to obtain access to credit card numbers, medical histories, and other forms of personally identifiable information?
Daniel Timbrell
December 1, 2023
Activate
untouchable mode.
Get started for free.

Lakera Guard protects your LLM applications from cybersecurity risks with a single line of code. Get started in minutes. Become stronger every day.

Join our Slack Community.

Several people are typing about AI/ML security. 
Come join us and 1000+ others in a chat that’s thoroughly SFW.