Abstract
Clustering, or otherwise known as cluster analysis, is a learning problem that takes place without any human supervision. This technique has often been utilized, much efficiently, in data analysis, and serves for observing and identifying interesting, useful, or desired patterns in the said data. The clustering technique functions by performing a structured division of the data involved, in similar objects based on the characteristics that it identifies. This process results in the formation of groups, and each group that is formed, is called a cluster. A single said cluster consists of objects from the data, that have similarities among other objects found in the same cluster, and resemble differences when compared to objects identified from the data that now exist in other clusters. The process of clustering is very significant in various aspects of data analysis, as it determines and presents the intrinsic grouping of objects present in the data, based on their attributes, in a batch of unlabeled raw data. A textbook or otherwise said, good criteria, does not exist in this method of cluster analysis. That is because this process is so different and so customizable for every user, that needs it in his/her various and different needs. There is no outright best clustering algorithm, as it massively depends on the user’s scenario and needs. This paper is intended to compare and study two different clustering algorithms. The algorithms under investigation are k-mean and mean shift. These algorithms are compared according to the following factors: time complexity, training, prediction performance and accuracy of the clustering algorithms.