Data drift refers to the change over time in the statistical properties of a predictive model's input data, which degrades the model's performance. This drift is a natural phenomenon in most real-world data because the environment in which these models operate is constantly changing. Data drift is a critical aspect to monitor throughout the lifecycle of a machine learning model: left unaccounted for, it can lead to sub-optimal predictions and misleading insights.
Data Drift in Practice
Data drift occurs when the statistical properties of the unobserved, incoming data that the model scores diverge from those of the data it was trained on. This often happens in dynamic environments where data can change rapidly and unpredictably.
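One common way to detect this divergence is to compare each incoming feature's distribution against a reference sample from training time with a statistical test. Below is a minimal sketch using SciPy's two-sample Kolmogorov-Smirnov test; the function name `detect_drift` and the `alpha` threshold are illustrative choices, not a standard API.

```python
import numpy as np
from scipy import stats


def detect_drift(reference, incoming, alpha=0.05):
    """Flag drift when the incoming sample's distribution differs
    significantly from the reference (training-time) sample."""
    statistic, p_value = stats.ks_2samp(reference, incoming)
    return bool(p_value < alpha)


rng = np.random.default_rng(42)
reference = rng.normal(loc=0.0, scale=1.0, size=1_000)

# Identical distribution: no drift flagged.
print(detect_drift(reference, reference.copy()))  # False

# Mean shifted by three standard deviations: drift flagged.
shifted = rng.normal(loc=3.0, scale=1.0, size=1_000)
print(detect_drift(reference, shifted))  # True
```

In practice you would run a test like this per feature on each batch of incoming data; for categorical features a chi-squared test is a common alternative.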
In order to monitor and address data drift, you need to version not only your models but also your data, keeping track of which model was trained on which dataset. Tracking model accuracy over time is equally important: a sustained decline in performance is often the first sign that data drift is occurring.
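The idea can be sketched with a small registry that fingerprints training data and logs accuracy per model version. Everything here is illustrative and standard-library only: `ModelRegistry`, its method names, and the degradation heuristic are hypothetical stand-ins for a real experiment-tracking or model-registry tool.

```python
import hashlib
import json


class ModelRegistry:
    """Toy registry linking model versions to data fingerprints
    and an accuracy history (illustrative, not a real tool)."""

    def __init__(self):
        self._records = {}

    @staticmethod
    def data_fingerprint(rows):
        """Deterministic hash identifying a training dataset."""
        payload = json.dumps(rows, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:12]

    def register(self, model_version, rows):
        self._records[model_version] = {
            "data_hash": self.data_fingerprint(rows),
            "accuracy_log": [],
        }

    def log_accuracy(self, model_version, accuracy):
        self._records[model_version]["accuracy_log"].append(accuracy)

    def is_degrading(self, model_version, window=3, drop=0.05):
        """Heuristic: accuracy fell by more than `drop` across the
        last `window` evaluations, a possible sign of drift."""
        log = self._records[model_version]["accuracy_log"]
        if len(log) < window:
            return False
        return (log[-window] - log[-1]) > drop


registry = ModelRegistry()
registry.register("churn-v1", [{"age": 34, "churned": 0}, {"age": 51, "churned": 1}])
for acc in [0.92, 0.91, 0.84]:
    registry.log_accuracy("churn-v1", acc)

print(registry.is_degrading("churn-v1"))  # True: accuracy dropped 0.08 over 3 runs
```

Hashing the data alongside the model version means that, when performance drops, you can tell whether the model was scoring data it never saw during training.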
Moreover, building automatic model retraining, alerting, and health checks into your machine learning pipeline helps ensure that models remain accurate even as data drift occurs.
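Such a health check can be as simple as a monitoring step that fires an alert and triggers a retraining job when accuracy falls below a threshold. The sketch below is a minimal illustration under that assumption; the `alert` and `retrain` callbacks are placeholders for your real alerting system and training job.

```python
def health_check(accuracy, threshold=0.85, alert=print, retrain=None):
    """One monitoring cycle: alert and optionally retrain when
    accuracy drops below the threshold. Returns the actions taken."""
    actions = []
    if accuracy < threshold:
        alert(f"accuracy {accuracy:.2f} below threshold {threshold:.2f}")
        actions.append("alerted")
        if retrain is not None:
            retrain()  # placeholder for kicking off a real training job
            actions.append("retrained")
    return actions


jobs = []
print(health_check(0.80, retrain=lambda: jobs.append("retrain-job")))  # ['alerted', 'retrained']
print(health_check(0.95))  # [] (healthy, nothing to do)
```

Scheduling this check after each evaluation run closes the loop: drift is detected, surfaced, and corrected without waiting for a human to notice degraded predictions.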