Cristea, D.-M.; Sima, I.; Iantovics, L.-B. How Good Perform Logistic Regression Algorithm for Complex Gastroenterological Image Analysis. Comparative Analysis with Physicians Performace. Preprints 2024, 2024100683. https://doi.org/10.20944/preprints202410.0683.v1
APA Style
Cristea, D. M., Sima, I., & Iantovics, L. B. (2024). How Good Perform Logistic Regression Algorithm for Complex Gastroenterological Image Analysis. Comparative Analysis with Physicians Performace. Preprints. https://doi.org/10.20944/preprints202410.0683.v1
Chicago/Turabian Style
Cristea, D., Ioan Sima and László-Barna Iantovics. 2024 "How Good Perform Logistic Regression Algorithm for Complex Gastroenterological Image Analysis. Comparative Analysis with Physicians Performace" Preprints. https://doi.org/10.20944/preprints202410.0683.v1
Abstract
Given the complexity and variability of gastrointestinal polyps, there is a growing need for more accurate machine-learning algorithms that can assist medical professionals in decision-making. In this study, we employed machine learning (ML) techniques to classify gastrointestinal polyps seen in colonoscopy images, which present a risk of colon cancer, a disease affecting a large portion of the population. The dataset comprised 152 instances and included three types of lesions, totaling 76 polyps. We applied Logistic Regression (LR) to classify the gastrointestinal images. The principal motivation for choosing this algorithm is that it is preferred by many medical researchers. A second motivation was to compare the polyp classification accuracy of physicians with that of LR with optimized hyperparameters. Model performance was evaluated with the following metrics: Accuracy, Precision, Recall, F1_Score, and Macro-average. We used the Multiclass Logistic Regression (mLR) classifier to classify polyps into hyperplastic (27.63%), serrated (19.74%), and adenoma (52.63%) lesions. Serrated polyps are hard to diagnose, even for physicians, because they share characteristics of both hyperplastic and adenoma polyps. One objective was to study hyperparameter tuning of Logistic Regression in order to find the best-fitting model. To check how the data mix influences the results, we performed a comprehensive statistical analysis for the two best models (LR with liblinear solver, L1 penalty, and C = 0.01), based on an algorithm that allows accurate results to be obtained. Such analyses are sometimes applied incorrectly in the scientific literature.
The model with optimized hyperparameters achieved an accuracy of 70.39%, indicating a reasonable level of prediction, surpassing both beginner and expert physicians and obtaining results similar to those of Random Subspace and Random Forest. The model classifies hyperplastic and adenoma polyps successfully but shows a more limited ability to predict serrated ones.
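For readers less familiar with the evaluation metrics named above, the following is a minimal sketch of how Accuracy, per-class Precision, Recall, and F1, and their macro-average can be computed for a three-class problem. The label lists `y_true` and `y_pred` are hypothetical toy values chosen for illustration; they are not data or results from the study, and the function names are our own rather than the authors' code.

```python
# Illustrative sketch only: toy labels, not the study's data or predictions.
CLASSES = ["hyperplastic", "serrated", "adenoma"]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred, classes=CLASSES):
    """Return {class: (precision, recall, f1)} from one-vs-rest counts."""
    metrics = {}
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        metrics[c] = (precision, recall, f1)
    return metrics


def macro_average(metrics):
    """Unweighted mean of precision, recall, and F1 over all classes."""
    n = len(metrics)
    return tuple(sum(m[i] for m in metrics.values()) / n for i in range(3))


# Hypothetical toy example: 4 adenoma, 2 hyperplastic, 2 serrated samples.
y_true = ["adenoma"] * 4 + ["hyperplastic"] * 2 + ["serrated"] * 2
y_pred = ["adenoma", "adenoma", "adenoma", "serrated",
          "hyperplastic", "hyperplastic",
          "adenoma", "serrated"]

acc = accuracy(y_true, y_pred)
scores = per_class_metrics(y_true, y_pred)
macro_p, macro_r, macro_f1 = macro_average(scores)
print(f"accuracy={acc:.2f} macro-F1={macro_f1:.2f}")
```

Because the macro-average weights every class equally, a weak minority class such as serrated polyps lowers it more visibly than it lowers overall accuracy, which is one reason to report both.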
Keywords
Medical Imaging; Gastrointestinal Polyps; Machine Learning; Smart Techniques in Healthcare; Classification problem; Logistic Regression algorithm; Multinomial classifier; Colorectal disease; Clinical Decision Support System; Medical Diagnosis
Subject
Computer Science and Mathematics, Artificial Intelligence and Machine Learning
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.