Inference is the process by which a trained machine learning model applies the patterns it has learnt to new, unseen data. In other words, it is the application phase, in which the model uses what it learned during training to predict outcomes, classify data, or make decisions about inputs it has not previously seen.

How inference works

The inference process begins after the machine learning model has been trained on a training dataset, validated on held-out data, and optimized.

Once the model is trained and fine-tuned, it is ready for inference. At this stage, new and unseen data is fed into the model, which makes predictions or classifications based on what it learned during training. The model's parameters are not updated at this stage; they are only applied. For example, the model might receive a new email and classify it as spam or non-spam.
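The spam example can be sketched in a few lines of Python. This is a minimal illustration, not a real spam filter: the word weights and bias below are hypothetical stand-ins for parameters that a training phase would normally produce, and the function names are made up for this example. The point is that inference only reads the stored parameters and never changes them.

```python
import math

# Hypothetical parameters standing in for a trained model. In practice these
# would be loaded from a file produced by the training phase.
WEIGHTS = {"free": 1.8, "winner": 2.1, "prize": 1.6, "meeting": -1.4, "report": -1.2}
BIAS = -0.5


def predict_spam_probability(text: str) -> float:
    """Score one unseen email using the fixed, already-trained parameters."""
    tokens = text.lower().split()
    # Words the model never saw during training contribute nothing to the score.
    score = BIAS + sum(WEIGHTS.get(token, 0.0) for token in tokens)
    return 1.0 / (1.0 + math.exp(-score))  # logistic (sigmoid) function


def classify(text: str, threshold: float = 0.5) -> str:
    """Turn the predicted probability into a spam / non-spam label."""
    return "spam" if predict_spam_probability(text) >= threshold else "non-spam"


print(classify("You are a winner claim your free prize"))
print(classify("Agenda for the meeting and the quarterly report"))
```

The 0.5 threshold is also an assumption; a production system would usually tune it to balance missed spam against legitimate email flagged by mistake.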

In production environments, inference often has to run under real-time latency constraints and with limited compute and memory. As a result, optimizing for speed and efficiency is crucial for machine learning inference. This might involve techniques like model pruning (removing weights that contribute little), quantization (storing weights at lower numerical precision), or hardware acceleration.
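To make one of these techniques concrete, the following sketch shows symmetric 8-bit quantization of a small list of weights in plain Python. It is a simplified illustration under stated assumptions: the weight values are invented, a single scale is shared by all weights, and real frameworks apply more elaborate schemes (per-channel scales, calibration data, integer kernels).

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [value * scale for value in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.42, -1.27, 0.05, 0.90]

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Rounding to the nearest integer bounds the error per weight by half a step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("quantized:", quantized)
print("scale:", scale)
print("max error:", max_error)
```

Each weight then needs one byte instead of the four used by a 32-bit float, at the cost of a small rounding error. Whether that error is acceptable depends on the model and has to be checked against accuracy on held-out data.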

In summary, machine learning inference is about applying a trained model to new data in the real world, making it a critical aspect of practical machine learning applications.
