What’s New?

Lakera Guard has been enhanced to detect and prevent inappropriate and harmful content across three key categories:

Violence and Self-Harm

Lakera Guard now flags content related to violent behavior, injury, death, and self-harm, including graphic descriptions that could put vulnerable users at risk.

Illicit Activities

The latest update enhances the detection of discussions around criminal activities such as fraud, cybercrime, and terrorism. Any attempt to solicit guidance on executing these illegal activities is immediately flagged.

Firearms and Dangerous Weapons

The new update extends moderation to content discussing the use of firearms, explosives, and related weaponry. This helps keep your platform free from discussions of dangerous and destructive weapons.

Performance and Flexibility

Lakera Guard’s enhanced content moderation not only adds broader coverage but also maintains top-tier performance. The new detectors are highly customizable, allowing you to choose which categories are flagged according to your application’s needs.
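To illustrate what per-category tailoring might look like on the application side, here is a minimal sketch. It does not call Lakera Guard and does not reflect its actual API: the category names, the shape of the results dictionary, and both helper functions are assumptions made up for this example. It only shows the idea of acting on the categories your application cares about.

```python
from typing import Dict, List, Set

# Hypothetical category identifiers for illustration only;
# these are NOT Lakera Guard's real category names.
ALL_CATEGORIES: Set[str] = {"violence_self_harm", "illicit_activities", "weapons"}


def flagged_categories(results: Dict[str, bool], enabled: Set[str]) -> List[str]:
    """Return the enabled categories that were flagged, sorted by name.

    `results` is an assumed mapping of category -> whether it was detected.
    """
    unknown = enabled - ALL_CATEGORIES
    if unknown:
        raise ValueError(f"Unknown categories: {sorted(unknown)}")
    return sorted(c for c, hit in results.items() if hit and c in enabled)


def should_block(results: Dict[str, bool], enabled: Set[str]) -> bool:
    """Block the input if any enabled category was flagged."""
    return bool(flagged_categories(results, enabled))


# Example: the application only cares about two of the three categories.
example_results = {
    "violence_self_harm": False,
    "illicit_activities": True,
    "weapons": True,
}
enabled = {"violence_self_harm", "illicit_activities"}

print(flagged_categories(example_results, enabled))  # ['illicit_activities']
print(should_block(example_results, enabled))        # True
```

In this sketch the weapons detection is ignored because that category is not enabled, while the illicit-activities detection still causes the input to be blocked. Refer to the documentation for the real configuration options.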

Despite the additional layers of detection, we’ve ensured that performance remains fast, with only a minimal increase in latency, keeping moderation efficient and responsive.

Why This Matters

AI applications must be prepared to handle all types of input, including dangerous or malicious requests from users. With Lakera Guard’s expanded content moderation, you can protect your platform from embarrassing, harmful, or even criminal content.

Whether you’re securing a public-facing AI tool or managing sensitive enterprise systems, these updates provide a safety net that supports compliance and user protection.

Ready to get started?

For more information on Lakera Guard’s new capabilities and how to integrate them, visit our documentation or contact our support team.
