Cluster analysis
- Pronunciation
- /KLUS-tur uh-NAL-uh-sis/
- Category
- Ecology
- Singular
- Cluster analysis
Definition
A multivariate statistical method that partitions a dataset into discrete groups (clusters) based on measured similarities among objects, such that within-group resemblance exceeds between-group resemblance. In biological applications, clustering algorithms typically operate on distance matrices derived from morphological measurements, genetic sequences, behavioral variables, or other data to reveal natural groupings without prior classification.
Etymology
From Old English 'clyster' (a bunch or group) and Greek 'analysis' (a breaking up or loosening).
Example
An entomologist sampling 20 forest plots might use hierarchical cluster analysis on abundance data to identify distinct groupings, perhaps revealing that fogging samples cluster separately from pitfall trap samples, indicating vertical stratification in the fauna.
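The scenario above can be sketched in a few lines of code. The following is a minimal illustration, not a reference implementation: it computes Bray-Curtis dissimilarities between samples and merges clusters by average linkage until two remain. The sample names, the abundance counts, and the helper functions `bray_curtis` and `average_linkage` are all hypothetical and invented for this example; real analyses would normally rely on an established statistics library.

```python
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def average_linkage(samples, n_clusters):
    """Agglomerative clustering with average linkage.

    samples: dict mapping sample name -> abundance vector.
    Returns a sorted list of clusters, each a sorted list of sample names.
    """
    names = list(samples)
    # Pairwise dissimilarities, keyed by the unordered pair of sample names.
    dist = {frozenset(p): bray_curtis(samples[p[0]], samples[p[1]])
            for p in combinations(names, 2)}
    clusters = [[n] for n in names]

    def cluster_distance(a, b):
        # Mean of all pairwise dissimilarities between members of a and b.
        total = sum(dist[frozenset((x, y))] for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: cluster_distance(clusters[ij[0]],
                                                   clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical counts of four taxa per sample (invented for illustration).
samples = {
    "fog_1": [30, 25, 2, 0],
    "fog_2": [28, 31, 0, 1],
    "fog_3": [35, 20, 3, 0],
    "pit_1": [1, 0, 40, 22],
    "pit_2": [0, 2, 35, 30],
    "pit_3": [2, 1, 38, 25],
}
groups = average_linkage(samples, n_clusters=2)
print(groups)
```

With these invented counts, the fogging and pitfall samples share few taxa, so they end up in separate clusters. Bray-Curtis is used here because it is a common choice for abundance data; a different metric or linkage criterion could give different groupings.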
Synonyms
- clustering
- numerical taxonomy (historical usage)
Related Terms
- multivariate analysis
- principal component analysis
- distance matrix
- dendrogram
- morphometrics
- beta diversity
- ordination
Usage Notes
Distinguished from classification (supervised learning) by its unsupervised nature: cluster analysis discovers structure without predefined groups. Common algorithms include hierarchical clustering (agglomerative or divisive), k-means, and DBSCAN. In systematics, phenetic clustering has been largely superseded by cladistic methods, though it persists in fields such as morphometrics. Results are sensitive to the choice of distance metric (Euclidean, Manhattan, Bray-Curtis) and linkage criterion. The number of clusters (k) may be determined by silhouette scores, elbow plots, or ecological interpretability.
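As a rough illustration of choosing k by silhouette score, the sketch below computes the mean silhouette width for three candidate partitions of a small one-dimensional dataset and keeps the best-scoring one. The data values, the candidate labelings, and the function name `silhouette_score` are assumptions made up for this example. The labelings stand in for the output of any clustering algorithm run with k = 2, 3, and 4.

```python
def silhouette_score(points, labels):
    """Mean silhouette width for 1-D data, using |a - b| as the distance."""
    n = len(points)
    widths = []
    for i in range(n):
        # Distances from point i to the other members of its own cluster.
        own = [abs(points[i] - points[j]) for j in range(n)
               if j != i and labels[j] == labels[i]]
        if not own:
            widths.append(0.0)  # convention: a singleton cluster scores 0
            continue
        a = sum(own) / len(own)
        # Smallest mean distance from point i to any other cluster.
        b = min(
            sum(abs(points[i] - points[j])
                for j in range(n) if labels[j] == other) / labels.count(other)
            for other in set(labels) if other != labels[i]
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / n


# Hypothetical measurements forming three visually obvious groups.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9, 9.0, 9.1, 8.8]

# Hypothetical candidate partitions for k = 2, 3, 4.
candidates = {
    2: [0, 0, 0, 1, 1, 1, 1, 1, 1],
    3: [0, 0, 0, 1, 1, 1, 2, 2, 2],
    4: [0, 0, 0, 1, 1, 1, 2, 2, 3],
}

scores = {k: silhouette_score(points, labels)
          for k, labels in candidates.items()}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

Here the three-cluster partition scores highest because each point sits much closer to its own group than to the nearest other group. A high silhouette score only measures geometric separation, so ecological interpretability remains a separate check.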