Preprint / Article / Version 1 / This version is not peer-reviewed
Exploring Determinants and Predictive Models of Latent Tuberculosis Infection Outcomes in Rural Areas of the Eastern Cape: A Pilot Comparative Analysis of Logistic Regression and Machine Learning Approaches
Version 1: Received: 28 October 2024 / Approved: 29 October 2024 / Online: 30 October 2024 (10:36:52 CET)
How to cite:
Faye, L. M.; Magwaza, C.; Dlatu, N.; Apalata, T. Exploring Determinants and Predictive Models of Latent Tuberculosis Infection Outcomes in Rural Areas of the Eastern Cape: A Pilot Comparative Analysis of Logistic Regression and Machine Learning Approaches. Preprints 2024, 2024102346. https://doi.org/10.20944/preprints202410.2346.v1
APA Style
Faye, L. M., Magwaza, C., Dlatu, N., & Apalata, T. (2024). Exploring Determinants and Predictive Models of Latent Tuberculosis Infection Outcomes in Rural Areas of the Eastern Cape: A Pilot Comparative Analysis of Logistic Regression and Machine Learning Approaches. Preprints. https://doi.org/10.20944/preprints202410.2346.v1
Chicago/Turabian Style
Faye, L. M., C. Magwaza, Ntandazo Dlatu, and Teke Apalata. 2024. "Exploring Determinants and Predictive Models of Latent Tuberculosis Infection Outcomes in Rural Areas of the Eastern Cape: A Pilot Comparative Analysis of Logistic Regression and Machine Learning Approaches." Preprints. https://doi.org/10.20944/preprints202410.2346.v1
Abstract
Latent tuberculosis infection (LTBI) poses a significant public health challenge, especially in populations with high HIV prevalence and limited healthcare access. Early detection and targeted interventions are essential to prevent progression to active tuberculosis. This study develops predictive models for LTBI outcomes using logistic regression and machine learning approaches and evaluates strategies to improve LTBI awareness and testing. Data from rural areas in the Eastern Cape, South Africa, were analyzed to identify key demographic, health, and knowledge-related factors influencing LTBI outcomes. Logistic regression was used to predict LTBI positivity from factors such as age, education, and HIV status. Machine learning models, including decision trees and random forests, were applied to compare predictive performance. A knowledge diffusion model was used to assess the impact of educational interventions on LTBI awareness and testing rates. Logistic regression achieved an accuracy of 66.67% with high precision (80%) but low recall (33%) for LTBI-positive cases, identifying age, HIV status, and LTBI awareness as significant predictors. The random forest model had lower accuracy (59.26%) than logistic regression but a higher F1-score (0.63), providing a better balance between precision and recall. Feature importance analysis revealed that age, occupation, and knowledge of LTBI symptoms were the most critical factors across both models. The knowledge diffusion model indicated that targeted interventions significantly increased LTBI awareness and testing, particularly in high-risk groups. While logistic regression offers more interpretable results for public health interventions, machine learning models such as random forests can capture more complex relationships between demographic and health factors and, in this pilot, balanced precision and recall better.
These findings highlight the need for targeted educational campaigns and increased LTBI testing in high-risk populations, particularly those with limited awareness of LTBI symptoms.
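The relationship between the reported accuracy, precision, recall, and F1-score can be illustrated with a small worked example. The sketch below is not the authors' code, and the confusion-matrix counts are hypothetical: the abstract does not report them, and they were chosen only because they are consistent with the logistic regression figures quoted above (66.67% accuracy, 80% precision, 33% recall). The `ConfusionMatrix` class is likewise an illustrative helper, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix for the LTBI-positive class (illustrative helper)."""
    tp: int  # true positives: LTBI-positive, predicted positive
    fp: int  # false positives: LTBI-negative, predicted positive
    fn: int  # false negatives: LTBI-positive, predicted negative
    tn: int  # true negatives: LTBI-negative, predicted negative

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    @property
    def precision(self) -> float:
        # Share of predicted positives that are truly positive.
        return self.tp / (self.tp + self.fp)

    @property
    def recall(self) -> float:
        # Share of true positives that the model actually finds.
        return self.tp / (self.tp + self.fn)

    @property
    def f1(self) -> float:
        # Harmonic mean of precision and recall.
        p, r = self.precision, self.recall
        return 2 * p * r / (p + r)


# ASSUMPTION: hypothetical counts, not taken from the paper. They were picked
# to match the reported logistic regression metrics on a 27-case test set.
logreg = ConfusionMatrix(tp=4, fp=1, fn=8, tn=14)

print(
    f"accuracy={logreg.accuracy:.2%} "
    f"precision={logreg.precision:.2%} "
    f"recall={logreg.recall:.2%} "
    f"F1={logreg.f1:.2f}"
)
```

Under these assumed counts, the F1-score for logistic regression comes out near 0.47, below the 0.63 reported for the random forest. This shows how a model with higher accuracy can still offer a weaker balance between precision and recall when it misses most positive cases.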
Keywords
Latent Tuberculosis Infection; Logistic Regression; Machine Learning; Random Forest; Public Health; LTBI Awareness; Predictive Modeling
Subject
Public Health and Healthcare, Public Health and Health Services
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.