K-means clustering (MacQueen 1967) is one of the most commonly used unsupervised machine learning algorithm for partitioning a given data set into a set of k groups (i.e. k clusters), where k represents the number of groups pre-specified by the analyst.It classifies objects in multiple groups (i.e., clusters), such that objects within the same cluster are as similar as possible (i.e., high.
Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720 Abstract Principal component analysis (PCA) is a widely used statistical technique for unsuper-vised dimension reduction. K-means cluster-ing is a commonly used data clustering for unsupervised learning tasks. Here we prove that principal components are the continuous solutions to the discrete cluster.