Preprint
Article

A Fast K-prototypes Algorithm Using Partial Distance Computation

A peer-reviewed article of this preprint also exists.

This version is not peer-reviewed

Submitted: 17 April 2017

Posted: 17 April 2017

Abstract
K-means is one of the most popular and widely used clustering algorithms; however, it is limited to numeric data. The k-prototypes algorithm is a well-known algorithm for clustering data with both numeric and categorical attributes. However, no studies have addressed accelerating the k-prototypes algorithm. In this paper, we propose a fast k-prototypes algorithm that gives the same answer as the original k-prototypes algorithm. The proposed algorithm avoids distance computations by using partial distance computation: it finds the minimum distance between an object and the cluster centers without computing the distance over all attributes, which reduces the computational cost. Partial distance computation relies on the fact that the maximum difference between two categorical attribute values is 1. If data objects have m categorical attributes, the maximum categorical difference between an object and a cluster center is therefore m. Our algorithm first computes distances using only the numeric attributes. If the difference between the smallest and the second-smallest numeric distances is greater than m, the nearest cluster center can be identified without computing distances over the categorical attributes. Experimental results show that the proposed k-prototypes algorithm improves computational performance over the original k-prototypes algorithm on our dataset.
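The shortcut described in the abstract can be illustrated with a minimal sketch. It assumes squared Euclidean distance for the numeric part, simple mismatch counting for the categorical part, and a weight `gamma` on the categorical term (default 1, which matches the bound of m stated above). The function names `assign_full` and `assign_partial` are hypothetical and are not taken from the paper.

```python
import numpy as np


def categorical_mismatches(x_cat, c_cat):
    # Each categorical attribute contributes 0 (match) or 1 (mismatch).
    return sum(a != b for a, b in zip(x_cat, c_cat))


def assign_full(x_num, x_cat, centers_num, centers_cat, gamma=1.0):
    # Baseline: full mixed distance to every cluster center.
    num = ((centers_num - x_num) ** 2).sum(axis=1)
    cat = np.array([categorical_mismatches(x_cat, c) for c in centers_cat])
    return int(np.argmin(num + gamma * cat))


def assign_partial(x_num, x_cat, centers_num, centers_cat, gamma=1.0):
    # Returns (index of nearest center, whether the categorical part was skipped).
    num = ((centers_num - x_num) ** 2).sum(axis=1)
    m = len(x_cat)  # number of categorical attributes
    if len(num) > 1:
        order = np.argsort(num)
        best, second = order[0], order[1]
        # The categorical part adds at most gamma * m to any distance, so a
        # numeric gap larger than that cannot be overturned.
        if num[second] - num[best] > gamma * m:
            return int(best), True
    # Otherwise fall back to the full distance (assumed fallback behavior).
    cat = np.array([categorical_mismatches(x_cat, c) for c in centers_cat])
    return int(np.argmin(num + gamma * cat)), False


centers_num = np.array([[0.0, 0.0], [10.0, 10.0]])
centers_cat = [("a", "x"), ("b", "y")]

# Large numeric gap: categorical attributes are never compared.
print(assign_partial(np.array([0.5, 0.5]), ("b", "y"), centers_num, centers_cat))
# Small numeric gap: the full distance is needed to decide.
print(assign_partial(np.array([5.0, 5.05]), ("a", "x"), centers_num, centers_cat))
```

In the first call the numeric gap (180) exceeds m = 2, so center 0 is returned without touching the categorical attributes. In the second call the gap is about 1, so the sketch falls back to the full distance; there the categorical mismatches change the winner from center 1 (numeric only) to center 0.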
Keywords: 
Subject: Computer Science and Mathematics  -   Computer Science
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2024 MDPI (Basel, Switzerland) unless otherwise stated