Preprint Article Version 1 This version is not peer-reviewed

Enhanced Feature Selection via Hierarchical Concept Modeling

Version 1: Received: 30 October 2024 / Approved: 31 October 2024 / Online: 4 November 2024 (09:14:14 CET)

How to cite: Muangprathub, J.; Wetchapram, P.; Wanichsombat, A.; Intarasit, A.; Sealee, J.; Boongasame, L.; Choopradit, B. Enhanced Feature Selection via Hierarchical Concept Modeling. Preprints 2024, 2024110024. https://doi.org/10.20944/preprints202411.0024.v1

Abstract

Feature selection aims to simplify modeling, make results easier to understand, improve data mining efficiency, and provide clean, understandable data during preparation. With big data, it also reduces computational time, improves prediction performance, and gives a better understanding of the data in machine learning and pattern recognition applications. In this study, we present a new feature selection approach based on hierarchical concept models, using formal concept analysis (FCA) and decision trees (DT) to select a subset of attributes. The presented methods are evaluated on all learned attributes with 10 datasets from the UCI Machine Learning Repository, using three classification algorithms: decision trees, support vector machines (SVM), and artificial neural networks (ANN). The hierarchical concept model is built from a dataset, and features are selected top-down by considering the feature (attribute) nodes at each level of the structure. Moreover, this study provides a mathematical feature selection approach with optimization based on the paired-samples t-test. To evaluate the effects of feature selection, the identified models were compared using information gain (IG) and chi-squared (CS) as indicators, and both forward selection (FS) and backward elimination (BS) were tested on the datasets to assess whether the presented model was effective in reducing the number of features used. The results clearly show that the proposed models, whether using DT or FCA, needed fewer features than the other methods for similar classification performance.
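
The top-down, level-by-level selection described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how the tree is grown or how many levels are kept, so the sketch assumes an ID3-style decision tree over categorical attributes with information gain as the split criterion. The function names (`features_by_level`, `select_top_down`), the `max_depth` cutoff, and the toy weather-style data are all hypothetical and exist only for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on one categorical feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def features_by_level(rows, labels, features, depth=0, levels=None):
    """Grow an ID3-style tree (an assumption) and record the split feature per depth."""
    if levels is None:
        levels = {}
    if len(set(labels)) <= 1 or not features:
        return levels  # pure node, or no attribute left to split on
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    if information_gain(rows, labels, best) <= 0:
        return levels  # no attribute separates the classes any further
    levels.setdefault(depth, set()).add(best)
    remaining = [f for f in features if f != best]
    partitions = {}
    for row, label in zip(rows, labels):
        part = partitions.setdefault(row[best], ([], []))
        part[0].append(row)
        part[1].append(label)
    for sub_rows, sub_labels in partitions.values():
        features_by_level(sub_rows, sub_labels, remaining, depth + 1, levels)
    return levels


def select_top_down(levels, max_depth):
    """Keep the features that appear in tree levels 0..max_depth (hypothetical cutoff)."""
    selected = set()
    for depth in range(max_depth + 1):
        selected |= levels.get(depth, set())
    return selected


# Toy data, invented for illustration only: "noise" carries no class information.
rows = [
    {"outlook": "sunny", "windy": "no", "noise": "a"},
    {"outlook": "sunny", "windy": "yes", "noise": "b"},
    {"outlook": "overcast", "windy": "no", "noise": "b"},
    {"outlook": "overcast", "windy": "yes", "noise": "a"},
    {"outlook": "rain", "windy": "no", "noise": "a"},
    {"outlook": "rain", "windy": "yes", "noise": "b"},
    {"outlook": "rain", "windy": "no", "noise": "b"},
    {"outlook": "rain", "windy": "yes", "noise": "a"},
]
labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]

levels = features_by_level(rows, labels, ["outlook", "windy", "noise"])
print(sorted(select_top_down(levels, max_depth=0)))  # ['outlook']
print(sorted(select_top_down(levels, max_depth=1)))  # ['outlook', 'windy']
```

In this toy case the uninformative `noise` attribute never becomes a tree node, so it is dropped regardless of the cutoff, which mirrors the idea of needing fewer features for similar classification performance.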

Keywords

Formal Concept Analysis; Feature Selection Methods; Hierarchical Concept Model; Paired-Samples t-Test; Classification

Subject

Computer Science and Mathematics, Computer Science


