Density-based clustering refers to a category of clustering algorithms that are based on the notion of density. Unlike partitioning and hierarchical clustering methods, density-based clustering identifies clusters as dense regions of data points separated by regions of lower point density. The main advantage of density-based clustering is its ability to discover clusters of arbitrary shapes and sizes. Some of the most popular density-based clustering algorithms include DBSCAN (Density-Based Spatial Clustering of Applications with Noise) and OPTICS (Ordering Points To Identify the Clustering Structure).
How it works
Density-based clustering algorithms primarily work by growing clusters according to a density criterion. Initially, the algorithm starts with an arbitrary data point. If there are a minimum number of points within a certain distance (eps) from this point, a new cluster is initiated. The algorithm then adds all the points within the 'eps' distance from the initial point to the cluster. Furthermore, for each of those points, the algorithm again checks for a minimum number of points within the 'eps' distance, and if found, adds those to the cluster as well.
This process continues until no more data points can be added to the cluster. After fully traversing one cluster, the algorithm then moves to a new arbitrary point which is not included in the previous cluster and repeats the same process. This results in clusters that are dense regions of data points separated by regions devoid or sparse of data points. Meanwhile, points that do not belong to any cluster is classified as noise or outliers.
Download this guide to delve into the most common LLM security risks and ways to mitigate them.
Lakera Guard protects your LLM applications from cybersecurity risks with a single line of code. Get started in minutes. Become stronger every day.
Several people are typing about AI/ML security. Come join us and 1000+ others in a chat that’s thoroughly SFW.