Version 1
Received: 6 November 2024 / Approved: 7 November 2024 / Online: 7 November 2024 (07:13:30 CET)
How to cite:
Prasad, M.; T, S. Clustering Accuracy Improvement Using Modified Min-Max Normalization Technique. Preprints 2024, 2024110486. https://doi.org/10.20944/preprints202411.0486.v1
APA Style
Prasad, M., & T, S. (2024). Clustering Accuracy Improvement Using Modified Min-Max Normalization Technique. Preprints. https://doi.org/10.20944/preprints202411.0486.v1
Chicago/Turabian Style
Prasad, M., and Srikanth T. 2024. "Clustering Accuracy Improvement Using Modified Min-Max Normalization Technique." Preprints. https://doi.org/10.20944/preprints202411.0486.v1
Abstract
Clustering algorithms such as k-Means are highly sensitive to the scale of input features. A common way to mitigate this is Min-Max normalization, which rescales feature values to a specified range. This paper investigates an alternative form of Min-Max scaling in which the normalization is based on the minimum (Xmin) and the mean (Xmean) of each feature rather than on its maximum. The proposed method is shown to be particularly effective in improving clustering accuracy for datasets whose features vary in scale and distribution. Experimental results demonstrate that this modified Min-Max scaling leads to better-defined clusters, higher clustering accuracy, and reduced bias in distance-based clustering algorithms. We validate the method using several standard clustering techniques, including k-Means, on publicly available datasets.
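The abstract does not state the exact formula, so the following is only a minimal sketch under one plausible reading: the maximum in the usual Min-Max denominator is replaced by the mean, giving x' = (x - Xmin) / (Xmean - Xmin). The function names `min_max_scale` and `min_mean_scale` and the toy data are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def min_max_scale(X):
    """Standard Min-Max scaling: (x - min) / (max - min), per feature."""
    X = np.asarray(X, dtype=float)
    x_min = X.min(axis=0)
    span = X.max(axis=0) - x_min
    span[span == 0] = 1.0  # constant feature: avoid division by zero
    return (X - x_min) / span

def min_mean_scale(X):
    """Assumed modified form: (x - min) / (mean - min), per feature.

    This is an interpretation of the abstract, not the authors' confirmed formula.
    """
    X = np.asarray(X, dtype=float)
    x_min = X.min(axis=0)
    span = X.mean(axis=0) - x_min
    span[span == 0] = 1.0  # constant feature: avoid division by zero
    return (X - x_min) / span

# Toy data: two features on very different scales, each with one large value.
X = np.array([[1.0, 100.0],
              [2.0, 200.0],
              [3.0, 300.0],
              [10.0, 1000.0]])

standard = min_max_scale(X)
modified = min_mean_scale(X)
```

Under this assumed formula, each feature's minimum maps to 0 and its mean maps to 1, so values above the mean exceed 1 and the output is not confined to [0, 1] as it is with standard Min-Max scaling.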
Keywords
Clustering Algorithms; K-Means Clustering; Clustering Performance Metrics; Silhouette Score; Cluster Validity; Impact of Scaling on Clustering; Feature Importance in Clustering; Confusion Matrix; Accuracy
Subject
Computer Science and Mathematics, Computer Science
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.