OpenAI’s CLIP in production


Daniel Timbrell
November 29, 2022

Deploying state-of-the-art machine learning models often leads to a myriad of issues stemming from the dependencies of the heavyweight deep learning frameworks involved, most commonly PyTorch and TensorFlow. At Lakera, we have released an implementation of OpenAI’s CLIP model that completely removes the need for PyTorch, enabling you to quickly and seamlessly deploy this fantastic model in production and even on edge devices.

Figure: the CLIP architecture (source: OpenAI)

CLIP (Contrastive Language-Image Pre-Training) is powering some of the most exciting image-to-text applications out there right now. It’s a neural network trained on a variety of (image, text) pairs. It can be instructed in natural language to predict the most relevant text snippet for a given image, without being directly optimised for that task, much like the zero-shot capabilities of GPT-2 and GPT-3. There are three main components that comprise this model (see the reference snippet after the list):

  1. The text tokeniser, which converts the given natural language into a sequence of token IDs.
  2. The image preprocessor, which resizes, crops, and normalises the given image into the array the model expects.
  3. The CLIP model itself, which embeds both inputs and outputs the cosine similarities between the text and image embeddings.
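
For reference, here is roughly how those three components appear in OpenAI’s original PyTorch stack, following the usage in OpenAI’s CLIP README (the image path and the prompts are placeholders):

```python
import clip  # OpenAI's reference implementation: pip install git+https://github.com/openai/CLIP.git
import torch
from PIL import Image

# clip.load returns both the model and its matching image preprocessor.
model, preprocess = clip.load("ViT-B/32", device="cpu")

# 1. Tokenise the text and 2. preprocess the image...
image = preprocess(Image.open("cat.png")).unsqueeze(0)
text = clip.tokenize(["a photo of a cat", "a photo of a dog"])

# 3. ...then let the model score every (image, text) pair.
with torch.no_grad():
    logits_per_image, logits_per_text = model(image, text)
    probs = logits_per_image.softmax(dim=-1).numpy()

print(probs)  # e.g. [[0.99, 0.01]] if the image shows a cat
```

All of this, from the tokeniser to the preprocessor, sits on top of PyTorch.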

The main issue we have found is that all three of these pieces rely on PyTorch, so we decided to simplify things for you!

We achieved this with the following steps, sketched in code after the list:

  1. The text tokeniser was rewritten in NumPy.
  2. We wrote our own image preprocessor, which mimics the functionality of CLIP’s preprocessor.
  3. We exported the CLIP model to the ONNX format, meaning that we have essentially swapped the PyTorch dependency for the lightweight onnxruntime.
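
To make those steps concrete, here are minimal sketches of the approach. They are illustrative rather than our exact production code: function names such as `pack_tokens` and `preprocess` are ours, the normalisation constants are the ones OpenAI publishes for CLIP, and the byte-pair encoding step itself (plain Python, no PyTorch needed) is omitted.

```python
import numpy as np
from PIL import Image

# --- Step 1: pack BPE token ids into the (batch, 77) array CLIP expects ---
CONTEXT_LENGTH = 77         # CLIP's fixed text context length
SOT, EOT = 49406, 49407     # <|startoftext|> / <|endoftext|> ids in CLIP's BPE vocab

def pack_tokens(bpe_ids):
    """bpe_ids: list of token-id lists, one per prompt (BPE step not shown)."""
    out = np.zeros((len(bpe_ids), CONTEXT_LENGTH), dtype=np.int64)
    for i, ids in enumerate(bpe_ids):
        ids = [SOT] + list(ids)[: CONTEXT_LENGTH - 2] + [EOT]
        out[i, : len(ids)] = ids
    return out

# --- Step 2: mimic CLIP's image preprocessor with NumPy + Pillow ---
MEAN = np.array([0.48145466, 0.4578275, 0.40821073], dtype=np.float32)
STD = np.array([0.26862954, 0.26130258, 0.27577711], dtype=np.float32)

def preprocess(path, size=224):
    """Bicubic-resize the shorter side to `size`, centre-crop, scale, normalise."""
    img = Image.open(path).convert("RGB")
    w, h = img.size
    scale = size / min(w, h)
    img = img.resize((round(w * scale), round(h * scale)), Image.BICUBIC)
    w, h = img.size
    left, top = (w - size) // 2, (h - size) // 2
    img = img.crop((left, top, left + size, top + size))
    arr = np.asarray(img, dtype=np.float32) / 255.0
    arr = (arr - MEAN) / STD              # broadcast over the channel axis
    return arr.transpose(2, 0, 1)[None]   # HWC -> (1, 3, size, size)
```

Step 3 is a one-time export, so PyTorch is still needed on the machine that produces the `.onnx` file, but never again afterwards. Something along these lines (the opset version may need adjusting for your PyTorch release):

```python
import clip
import torch

# Load the reference model once and trace it out to ONNX.
model, _ = clip.load("ViT-B/32", device="cpu")
torch.onnx.export(
    model,
    (torch.randn(1, 3, 224, 224), clip.tokenize(["a photo of a cat"])),
    "clip.onnx",
    input_names=["image", "text"],
    output_names=["logits_per_image", "logits_per_text"],
    dynamic_axes={"image": {0: "image_batch"}, "text": {0: "text_batch"}},
    opset_version=14,
)
```

After that, inference needs nothing heavier than onnxruntime:

```python
import onnxruntime as ort

session = ort.InferenceSession("clip.onnx", providers=["CPUExecutionProvider"])
logits_per_image, logits_per_text = session.run(
    None,
    # Token ids below are dummies; in practice they come from the BPE step.
    {"image": preprocess("cat.png"), "text": pack_tokens([[320, 1125]])},
)
```

The production environment now ships only numpy, Pillow, and onnxruntime.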

Try it out! Don’t forget to give it a star and reach out if you have any feedback!
