Cluster analysis

Pronunciation
/KLUS-tur uh-NAL-uh-sis/
Category
Ecology
Singular
Cluster analysis

Definition

A multivariate statistical method that partitions a dataset into discrete groups (clusters) based on measured similarities among objects, such that within-group resemblance exceeds between-group resemblance. In biological applications, clustering algorithms typically operate on distance matrices derived from morphological measurements, genetic sequences, behavioral variables, or other data to reveal natural groupings without prior classification.
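As a rough illustration of how a distance matrix might be derived from abundance data, the sketch below computes pairwise Bray-Curtis dissimilarities in plain Python. The function names (`bray_curtis`, `distance_matrix`) and the abundance counts are hypothetical, chosen only for this example; treating two empty samples as identical is an assumed convention.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0  # assumed convention: two empty samples are identical
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def distance_matrix(samples, metric):
    """Symmetric matrix of pairwise distances between samples."""
    n = len(samples)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = metric(samples[i], samples[j])
    return matrix


# Hypothetical abundance counts: rows are samples, columns are species.
samples = [
    [10, 0, 5],
    [8, 1, 6],
    [0, 12, 1],
]
dist = distance_matrix(samples, bray_curtis)
```

Here the first two samples share most of their species and so end up much closer to each other than either is to the third.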

Etymology

From Old English 'clyster' (a bunch or group) and Greek 'analysis' (a breaking up or loosening).

Example

An entomologist sampling insects across 20 forest plots might use hierarchical cluster analysis on abundance data to identify distinct assemblage types, perhaps revealing that canopy fogging samples cluster separately from pitfall trap samples, indicating vertical stratification in the fauna.
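The scenario above can be sketched with a small agglomerative routine. This is a minimal illustration, not a reference implementation: it assumes average linkage (UPGMA), one of several possible linkage criteria, and the four-sample distance matrix is invented, with samples 0 and 1 standing in for fogging samples and samples 2 and 3 for pitfall trap samples.

```python
def average_linkage(dist, n_clusters):
    """Agglomerative clustering with average linkage on a square distance matrix.

    Returns a sorted list of clusters, each a sorted list of sample indices.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(dist))]

    def mean_distance(a, b):
        # Average of all pairwise distances between members of a and b.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest mean distance.
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = mean_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        merged = sorted(clusters[x] + clusters[y])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical dissimilarities: samples 0-1 are fogging, 2-3 are pitfall traps.
dist = [
    [0.00, 0.20, 0.90, 0.80],
    [0.20, 0.00, 0.85, 0.90],
    [0.90, 0.85, 0.00, 0.15],
    [0.80, 0.90, 0.15, 0.00],
]
groups = average_linkage(dist, n_clusters=2)
print(groups)  # [[0, 1], [2, 3]]
```

With these made-up values the two sampling methods fall into separate clusters. Real data would rarely separate this cleanly, and a different linkage criterion could give a different result.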

Synonyms

  • clustering
  • numerical taxonomy (historical usage)

Related Terms

  • multivariate analysis
  • principal component analysis
  • distance matrix
  • dendrogram
  • morphometrics
  • beta diversity
  • ordination

Usage Notes

Distinguished from classification (supervised learning) by its unsupervised nature: cluster analysis discovers structure without predefined groups. Common algorithms include hierarchical clustering (agglomerative or divisive), k-means, and DBSCAN. In systematics, phenetic clustering has been largely superseded by cladistic methods, though it persists in fields such as morphometrics. Results are sensitive to the choice of distance metric (Euclidean, Manhattan, Bray-Curtis) and linkage criterion. The number of clusters (k) may be determined by silhouette scores, elbow plots, or ecological interpretability.
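To make the silhouette criterion concrete, the following sketch computes the mean silhouette width for a given labelling directly from a distance matrix. The function name, the toy distance matrix, and the two candidate labellings are all hypothetical. Scoring singleton clusters as 0 is a common convention that is assumed here.

```python
def silhouette_score(dist, labels):
    """Mean silhouette width of a labelling, given a square distance matrix."""
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, []).append(i)
    if len(groups) < 2:
        raise ValueError("silhouette needs at least two clusters")

    widths = []
    for i, lab in enumerate(labels):
        own = groups[lab]
        if len(own) == 1:
            widths.append(0.0)  # assumed convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in groups.items()
            if other != lab
        )
        scale = max(a, b)
        widths.append((b - a) / scale if scale > 0 else 0.0)
    return sum(widths) / len(widths)


# Hypothetical dissimilarities among four samples.
dist = [
    [0.00, 0.20, 0.90, 0.80],
    [0.20, 0.00, 0.85, 0.90],
    [0.90, 0.85, 0.00, 0.15],
    [0.80, 0.90, 0.15, 0.00],
]
good = silhouette_score(dist, [0, 0, 1, 1])  # groups the close pairs together
poor = silhouette_score(dist, [0, 1, 0, 1])  # splits the close pairs apart
```

A labelling that keeps similar samples together scores close to 1, while one that splits them scores near or below 0. Comparing such scores across candidate values of k is one way to choose the number of clusters, alongside the other criteria mentioned above.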