Version 1
Received: 27 September 2024 / Approved: 29 September 2024 / Online: 30 September 2024 (04:10:40 CEST)
How to cite:
Nassih, R.; Berrado, A. Advancing Subgroup Discovery in Classification: A Novel Random PRIM-Based Classifier Compared with well-established Algorithms. Preprints 2024, 2024092331. https://doi.org/10.20944/preprints202409.2331.v1
APA Style
Nassih, R., & Berrado, A. (2024). Advancing Subgroup Discovery in Classification: A Novel Random PRIM-Based Classifier Compared with well-established Algorithms. Preprints. https://doi.org/10.20944/preprints202409.2331.v1
Chicago/Turabian Style
Nassih, R., and Abdelaziz Berrado. 2024. "Advancing Subgroup Discovery in Classification: A Novel Random PRIM-Based Classifier Compared with well-established Algorithms." Preprints. https://doi.org/10.20944/preprints202409.2331.v1
Abstract
Machine learning algorithms have made significant strides, achieving high accuracy in many applications. However, traditional models often need large datasets, as they typically peel substantial portions of the data in each iteration, which complicates classifier development when data is scarce. In critical fields like healthcare, there is a growing need to identify and analyze small yet significant subgroups within data. To address these challenges, we introduce a novel classifier based on the Patient Rule Induction Method (PRIM), a subgroup discovery algorithm. PRIM finds rules by peeling a minimal amount of data at each iteration, enabling the discovery of highly relevant regions. Unlike traditional classifiers, PRIM requires experts to select input spaces manually. Our innovation transforms PRIM into an interpretable classifier by starting with random input space selections for each class, then pruning rules using Metarules, and finally selecting definitive rules for the classifier. Tested against popular algorithms such as Random Forest, Logistic Regression, and XGBoost, our Random PRIM-based Classifier (R-PRIM-Cl) demonstrates comparable robustness, superior interpretability, and the ability to handle categorical and numeric variables. It discovers more rules in certain datasets, making it especially valuable in fields where understanding the model's decision-making process is as important as its predictive accuracy.
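To make the "peeling" idea concrete, the following is a minimal sketch of a PRIM-style peeling loop for numeric features and a binary target. It is not the R-PRIM-Cl procedure: it omits the random input space selection, the Metarule pruning, the final rule selection, and categorical variables. The function names (`in_box`, `prim_peel`), the parameters `alpha` and `min_support`, the stopping rule (stop when no peel raises the target mean), and the synthetic data are all assumptions made for illustration.

```python
import random


def in_box(row, box):
    """Return True if every feature of `row` lies inside its [lo, hi] bound."""
    return all(lo <= v <= hi for v, (lo, hi) in zip(row, box))


def prim_peel(X, y, alpha=0.05, min_support=0.1):
    """Greedy peeling sketch (hypothetical helper, not the paper's code).

    At each step, try removing roughly a fraction `alpha` of the points in
    the current box from the low or high end of each feature, and keep the
    peel that gives the highest target mean among the remaining points.
    Stops when no peel improves the mean or support would drop too low.
    """
    n, d = len(X), len(X[0])
    box = [(min(r[j] for r in X), max(r[j] for r in X)) for j in range(d)]
    inside = list(range(n))
    mean = sum(y) / n

    while True:
        best = None  # (mean, feature index, new bounds, kept indices)
        for j in range(d):
            vals = sorted(X[i][j] for i in inside)
            k = max(1, int(alpha * len(vals)))
            if k >= len(vals):
                continue
            lo, hi = box[j]
            # Candidate peels: raise the lower bound or lower the upper bound.
            for new_lo, new_hi in ((vals[k], hi), (lo, vals[-k - 1])):
                kept = [i for i in inside if new_lo <= X[i][j] <= new_hi]
                if len(kept) == len(inside) or len(kept) < min_support * n:
                    continue
                cand_mean = sum(y[i] for i in kept) / len(kept)
                if best is None or cand_mean > best[0]:
                    best = (cand_mean, j, (new_lo, new_hi), kept)
        if best is None or best[0] <= mean:
            break
        mean, j, bounds, inside = best
        box[j] = bounds
    return box, inside, mean


# Synthetic example (assumed data): positives occupy one corner of the unit square.
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(400)]
y = [1 if (row[0] >= 0.6 and row[1] <= 0.4) else 0 for row in X]
base_rate = sum(y) / len(y)

box, inside, mean = prim_peel(X, y)
print("box:", box)
print("support:", len(inside) / len(X), "target mean:", mean, "base rate:", base_rate)
```

Each box found this way reads directly as a rule (a conjunction of interval conditions on the selected features), which is the source of the interpretability the abstract refers to. A small `alpha` is what makes the method "patient": each step discards only a small share of the points still in the box.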
Computer Science and Mathematics, Artificial Intelligence and Machine Learning
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.