ML Model Validation

Machine learning model validation is the process of confirming that a predictive model performs well on new, independent data. It verifies that the model generalizes, is accurate and reliable, and is not overfitted to the training dataset. Model validation is an essential step in the machine learning process: it helps ensure that the model is fit for purpose and that its predictions are trustworthy.

How ML Model Validation works

Model validation generally relies on three main approaches:

In-sample validation: This technique uses the same dataset for both training and testing the model. It shows how well the model has fit the training data, but on its own it cannot reveal overfitting, because the model has already seen every example it is scored on.
Out-of-sample validation or holdout validation: This method divides the dataset into a training set and a test set. The model is trained on the training set and then evaluated on the test set, which estimates its performance on unseen data.
Cross-validation: This technique divides the dataset into ‘k’ folds or subsets. The model is trained on k-1 folds and tested on the left-out fold. This process is repeated k times so that each fold serves as the test set exactly once. The reported performance measure is the average of the values computed across the k iterations.
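The three approaches above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the data is synthetic, the "model" is a one-variable least-squares line, the metric is mean squared error, and the helper names (fit_line, mse, k_fold_indices, cross_validate) are hypothetical names chosen for this example. It uses only the standard library.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (illustrative model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def mse(model, xs, ys):
    """Mean squared error of the fitted line on the given points."""
    slope, intercept = model
    return mean((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))

def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(xs, ys, k=5, seed=0):
    """Train on k-1 folds, test on the left-out fold, repeat k times."""
    folds = k_fold_indices(len(xs), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit_line([xs[j] for j in train_idx], [ys[j] for j in train_idx])
        scores.append(mse(model, [xs[j] for j in test_idx], [ys[j] for j in test_idx]))
    return mean(scores), scores

# Synthetic data (an assumption for this sketch): y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]

# In-sample validation: score the model on the data it was trained on.
in_sample_mse = mse(fit_line(xs, ys), xs, ys)

# Holdout validation: shuffle, train on 80%, test on the remaining 20%.
order = list(range(len(xs)))
random.Random(0).shuffle(order)
cut = int(0.8 * len(order))
train, test = order[:cut], order[cut:]
holdout_model = fit_line([xs[i] for i in train], [ys[i] for i in train])
holdout_mse = mse(holdout_model, [xs[i] for i in test], [ys[i] for i in test])

# Cross-validation: average the k per-fold test scores.
cv_mse, fold_scores = cross_validate(xs, ys, k=5)

print("in-sample MSE:", in_sample_mse)
print("holdout MSE:  ", holdout_mse)
print("5-fold CV MSE:", cv_mse)
```

For a simple model like this one the three numbers will usually be close. The gap between the in-sample score and the other two tends to widen as the model becomes more flexible, which is the symptom of overfitting that holdout and cross-validation are meant to expose.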

Additionally, techniques like bootstrapping, the jackknife, and permutation tests can also be used for model validation. These techniques help evaluate whether the model generalizes to unseen data, detect overfitting and underfitting, and guide hyperparameter tuning toward better performance. The choice of validation technique should reflect the final use case of the model. For example, if the model needs to generalize well to completely new data, out-of-sample techniques like cross-validation are often more appropriate than in-sample checks.
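Two of the resampling techniques mentioned above, bootstrapping and permutation tests, can be sketched as follows. This is an illustrative example under assumptions rather than a canonical recipe: it reuses a one-variable least-squares line as the model, scores with mean squared error, and the function names (bootstrap_oob_mse, permutation_p_value) are hypothetical. The bootstrap variant shown is the out-of-bag form, where each resampled model is scored on the points that were not drawn. The jackknife (leave-one-out resampling) is omitted for brevity.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (illustrative model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def mse(model, xs, ys):
    """Mean squared error of the fitted line on the given points."""
    slope, intercept = model
    return mean((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))

def bootstrap_oob_mse(xs, ys, n_rounds=200, seed=0):
    """Fit on samples drawn with replacement; score on the out-of-bag points."""
    rng = random.Random(seed)
    n = len(xs)
    scores = []
    for _ in range(n_rounds):
        sample = [rng.randrange(n) for _ in range(n)]
        drawn = set(sample)
        oob = [i for i in range(n) if i not in drawn]
        if len(drawn) < 2 or not oob:
            continue  # degenerate resample: cannot fit a line or nothing to test on
        model = fit_line([xs[i] for i in sample], [ys[i] for i in sample])
        scores.append(mse(model, [xs[i] for i in oob], [ys[i] for i in oob]))
    return mean(scores)

def permutation_p_value(xs, ys, n_perm=200, seed=0):
    """Fraction of label shuffles whose holdout MSE is as good as the real one."""
    rng = random.Random(seed)
    train = range(0, len(xs), 2)  # even indices train, odd indices test
    test = range(1, len(xs), 2)

    def holdout_mse(targets):
        model = fit_line([xs[i] for i in train], [targets[i] for i in train])
        return mse(model, [xs[i] for i in test], [targets[i] for i in test])

    observed = holdout_mse(ys)
    as_good = 0
    for _ in range(n_perm):
        shuffled = ys[:]
        rng.shuffle(shuffled)  # breaks any real link between xs and ys
        if holdout_mse(shuffled) <= observed:
            as_good += 1
    return (as_good + 1) / (n_perm + 1)

# Synthetic data (an assumption for this sketch): y = 2x + 1 plus Gaussian noise.
rng = random.Random(7)
xs = [i / 10 for i in range(60)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]

oob_mse = bootstrap_oob_mse(xs, ys)
p_value = permutation_p_value(xs, ys)
print("bootstrap out-of-bag MSE:", oob_mse)
print("permutation p-value:     ", p_value)
```

A small p-value here suggests the model's holdout score is unlikely to arise when the targets carry no information about the inputs. The out-of-bag error gives a second estimate of performance on unseen data that does not depend on a single train/test split.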
