Hierarchical Clustering
Brian T. Luke

In Hierarchical Clustering and Diversity Selection, a threshold value is used to determine when the procedure stops. The number of final clusters, or diverse objects, is not known ahead of time. Conversely, in K-Means and Fuzzy Clustering, the number of clusters is chosen before the procedure starts.

There are two types of hierarchical clustering: agglomerative and divisive. In agglomerative clustering, each object is initially placed in its own group. The two "closest" groups are then combined into a single group, and this merging is repeated until the threshold distance is reached. In divisive clustering, all objects are initially placed into a single group. The two objects that are in the same group but are "farthest" apart are used as seed points for two new groups, and every object in the original group is placed into the new group whose seed is closest. This splitting continues until a threshold distance is reached.
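The two procedures can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a reference implementation: objects are taken to be coordinate tuples, distance is Euclidean, "closest" groups are judged by their nearest pair of members, and the names `agglomerate`, `split_once`, and `threshold` are hypothetical.

```python
import math
from itertools import combinations


def group_distance(a, b):
    """Assumed rule: distance between the nearest members of groups a and b."""
    return min(math.dist(p, q) for p in a for q in b)


def agglomerate(points, threshold):
    """Merge the two closest groups until they are farther apart than threshold."""
    groups = [[p] for p in points]  # each object starts in its own group
    while len(groups) > 1:
        # Find the indices (i < j) of the two closest groups.
        i, j = min(
            combinations(range(len(groups)), 2),
            key=lambda ij: group_distance(groups[ij[0]], groups[ij[1]]),
        )
        if group_distance(groups[i], groups[j]) > threshold:
            break  # the closest groups are too far apart, so stop
        groups[i] = groups[i] + groups[j]
        del groups[j]  # j > i, so index i is unaffected
    return groups


def split_once(group):
    """One divisive step: seed two groups with the two farthest-apart objects."""
    seed_a, seed_b = max(combinations(group, 2), key=lambda pq: math.dist(*pq))
    # Each object joins the new group whose seed is closest (ties go to seed_a).
    left = [p for p in group if math.dist(p, seed_a) <= math.dist(p, seed_b)]
    right = [p for p in group if math.dist(p, seed_a) > math.dist(p, seed_b)]
    return left, right


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
clusters = agglomerate(points, threshold=2.0)
left, right = split_once(points)
print(len(clusters), "groups after merging")
print(len(left), "and", len(right), "objects after one split")
```

Note that the number of groups is never passed in; it falls out of the threshold, which is the property that distinguishes these methods from K-Means.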

To perform a hierarchical clustering, a measure of the distance between two objects must be chosen. In addition, agglomerative clustering needs a rule for deciding which groups should be combined, or linked. Some options are single linkage, average linkage, complete linkage, and Ward's method.
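As a rough illustration of how these linkage options differ, the sketch below scores the same pair of groups under each rule. It assumes Euclidean distance between coordinate tuples and uses the common textbook definitions: nearest pair, farthest pair, mean over all pairs, and, for Ward's method, the increase in within-group sum of squares caused by the merge. The function names are hypothetical.

```python
import math


def single_linkage(a, b):
    """Distance between the closest pair of objects, one from each group."""
    return min(math.dist(p, q) for p in a for q in b)


def complete_linkage(a, b):
    """Distance between the farthest pair of objects, one from each group."""
    return max(math.dist(p, q) for p in a for q in b)


def average_linkage(a, b):
    """Mean distance over all pairs of objects, one from each group."""
    return sum(math.dist(p, q) for p in a for q in b) / (len(a) * len(b))


def centroid(group):
    """Coordinate-wise mean of the objects in a group."""
    return tuple(sum(coords) / len(group) for coords in zip(*group))


def ward_cost(a, b):
    """Increase in within-group sum of squared distances if a and b merge."""
    size_factor = len(a) * len(b) / (len(a) + len(b))
    return size_factor * math.dist(centroid(a), centroid(b)) ** 2


group_a = [(0.0, 0.0), (0.0, 2.0)]
group_b = [(3.0, 0.0), (3.0, 2.0)]
for rule in (single_linkage, average_linkage, complete_linkage, ward_cost):
    print(rule.__name__, round(rule(group_a, group_b), 4))
```

In each case the agglomerative step combines the pair of groups with the smallest score, so the choice of rule changes which groups are linked and at what threshold the procedure stops.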