Preprint Article, Version 1. This version is not peer-reviewed.

How Good Perform Logistic Regression Algorithm for Complex Gastroenterological Image Analysis. Comparative Analysis with Physicians Performace

Current address: University ’1 Decembrie 1918’ of Alba Iulia
These authors contributed equally to this work.
Version 1 : Received: 8 October 2024 / Approved: 9 October 2024 / Online: 9 October 2024 (10:47:27 CEST)

How to cite: Cristea, D.-M.; Sima, I.; Iantovics, L.-B. How Good Perform Logistic Regression Algorithm for Complex Gastroenterological Image Analysis. Comparative Analysis with Physicians Performace. Preprints 2024, 2024100683. https://doi.org/10.20944/preprints202410.0683.v1

Abstract

Given the complexity and variability of gastrointestinal polyps, there is a growing need for more accurate machine-learning algorithms that can assist medical professionals in decision-making. In our study, we employed machine learning (ML) techniques to classify gastrointestinal polyps seen in colonoscopy images; such polyps present a risk of colon cancer, which affects a large portion of the population. The dataset used in the research comprised 152 instances and included three types of lesions, totaling 76 polyps. The study applied Logistic Regression (LR) to classify gastrointestinal images. The principal motivation for choosing this algorithm is that it is preferred by many medical researchers. A further motivation was to compare the polyp classification accuracy of physicians with that of LR with optimized hyperparameters. Model performance was evaluated with the following metrics: Accuracy, Precision, Recall, F1_Score, and Macro-average. We used the Multiclass Logistic Regression (mLR) classifier to classify polyps into hyperplastic (27.63%), serrated (19.74%), and adenoma (52.63%) lesions. It should be noted that serrated polyps are hard to diagnose, even for physicians, because they share characteristics of both hyperplastic polyps and adenomas. One objective was to study the hyperparameter tuning of Logistic Regression in order to find the best-fitting model. To check the influence of the data mix on the results, a comprehensive statistical analysis was performed for the two best models (LR with liblinear solver, L1 penalty, and C = 0.01), based on an algorithm that allows accurate results to be obtained. Such analyses are misapplied in the scientific literature.
The model with optimized hyperparameters achieved an accuracy of 70.39%, indicating a reasonable level of prediction; it surpassed both beginner and expert physicians and obtained results similar to those of Random Subspace and Random Forest. The model successfully classified hyperplastic and adenoma polyps but showed a more limited ability to predict serrated ones.
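
As an illustration of the metrics named in the abstract, the following minimal sketch computes Accuracy and per-class Precision, Recall, and F1 from a three-class confusion matrix, together with their macro-averages. The row totals (21, 15, 40) are what the stated class percentages imply for 76 polyps. The individual cell values are invented for illustration only and are not results from the study. The function names are hypothetical, and the code is not the authors' implementation.

```python
from typing import Dict, List

CLASSES = ["hyperplastic", "serrated", "adenoma"]

# HYPOTHETICAL confusion matrix: cm[i][j] = items of true class i predicted as class j.
# Row totals follow the abstract's class shares of 76 polyps (21, 15, 40);
# the cell values themselves are made up for this example.
CM = [
    [16, 2, 3],   # true hyperplastic
    [5, 4, 6],    # true serrated
    [3, 3, 34],   # true adenoma
]


def accuracy(cm: List[List[int]]) -> float:
    """Fraction of items on the diagonal (correctly classified)."""
    correct = sum(cm[k][k] for k in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def per_class_metrics(cm: List[List[int]]) -> Dict[str, Dict[str, float]]:
    """One-vs-rest precision, recall and F1 for each class."""
    n = len(cm)
    result = {}
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true k, predicted as another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        result[CLASSES[k]] = {"precision": precision, "recall": recall, "f1": f1}
    return result


def macro_average(metrics: Dict[str, Dict[str, float]], key: str) -> float:
    """Unweighted mean of one metric over all classes."""
    return sum(m[key] for m in metrics.values()) / len(metrics)


if __name__ == "__main__":
    metrics = per_class_metrics(CM)
    print(f"accuracy: {accuracy(CM):.4f}")
    for name, m in metrics.items():
        print(name, {k: round(v, 4) for k, v in m.items()})
    for key in ("precision", "recall", "f1"):
        print(f"macro {key}: {macro_average(metrics, key):.4f}")
```

Because the macro-average weights every class equally, a weak minority class such as the serrated polyps lowers it more than it lowers overall accuracy. This is one reason to report both figures.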

Keywords

Medical Imaging; Gastrointestinal Polyps; Machine Learning; Smart Techniques in Healthcare; Classification problem; Logistic Regression algorithm; Multinomial classifier; Colorectal disease; Clinical Decision Support System; Medical Diagnosis

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
