
RAG (Retrieval Augmented Generation)

Retrieval Augmented Generation (RAG) is a natural language processing (NLP) technique that combines a language model with information retrieval. In RAG, a query is first used to retrieve relevant documents or data from a large corpus. The retrieved material is then supplied to the language model as additional context, which helps it produce responses that are more accurate and better grounded in the source material.

How RAG Works

In practice, RAG operates in two main stages. First, given an input query, a retriever searches a large dataset (like Wikipedia) to find relevant documents. This is the retrieval stage. Then, these documents are passed, together with the query, to a generative model (like GPT-3), which synthesizes the retrieved information into a coherent and contextually appropriate response. This is the generation stage. The method is particularly effective when the language model needs external knowledge or specific information that is not contained in its training data.
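
The two stages can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the retriever here is a simple bag-of-words cosine similarity (real systems typically use dense embeddings and a vector index), the three-sentence `CORPUS` is invented for the example, and `generate` is a hypothetical placeholder standing in for a call to an actual language model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k=2):
    """Stage 1 (retrieval): return up to k documents most similar to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(doc))), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share no tokens with the query.
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, documents):
    """Combine the retrieved documents and the query into one prompt."""
    context = "\n".join(f"- {doc}" for doc in documents)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


def generate(prompt):
    """Stage 2 (generation): hypothetical stand-in for a real language model call."""
    return f"[model output for a prompt of {len(prompt)} characters]"


def rag_answer(query, corpus, k=2):
    """Run retrieval, then generation, for a single query."""
    documents = retrieve(query, corpus, k)
    return generate(build_prompt(query, documents))


# Toy corpus, invented for this example.
CORPUS = [
    "The Eiffel Tower is located in Paris and was completed in 1889.",
    "Photosynthesis converts light energy into chemical energy in plants.",
    "Paris is the capital of France.",
]

if __name__ == "__main__":
    question = "Where is the Eiffel Tower?"
    print(build_prompt(question, retrieve(question, CORPUS)))
    print(rag_answer(question, CORPUS))
```

Keeping retrieval and generation as separate functions mirrors the two-stage description: the retriever or the model can be swapped independently, and the prompt built between them shows exactly which external text the model is given.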
