Preprint Article, Version 1. This version is not peer-reviewed.

Evaluating Diabetes Risk: Bayesian Hierarchical Models and Machine Learning Integration

Version 1: Received: 1 August 2024 / Approved: 2 August 2024 / Online: 2 August 2024 (16:41:24 CEST)

How to cite: Muhammad Khan, N.; Rahman, M. M.; Chowdhury, M. H. Evaluating Diabetes Risk: Bayesian Hierarchical Models and Machine Learning Integration. Preprints 2024, 2024080182. https://doi.org/10.20944/preprints202408.0182.v1

Abstract

Type 2 diabetes mellitus (T2DM) is a global health concern driven by factors such as obesity, sedentary behavior, and poor diet. This study uses data from the 2017-18 Bangladesh Demographic and Health Survey (BDHS) to analyze regional and individual predictors of diabetes and prediabetes. Employing a Bayesian multinomial mixed-effects model, we account for regional variability and for individual factors: age, gender, BMI, residence, wealth, education, employment, and hypertension. Our results indicate significant regional differences and associations of demographic and health-related factors with diabetes risk. Younger individuals and those with higher BMI are more likely to be diabetic, while hypertension significantly increases diabetes risk. We applied machine learning (ML) models, including logistic regression, decision trees, k-nearest neighbor, linear discriminant analysis, and random forest, to classify diabetic status using these predictors, and assessed their accuracy through 10-fold cross-validation. Logistic regression and linear discriminant analysis demonstrated robust performance across various response distributions. Simulation studies further examined the impact of different response distributions on model performance, revealing significant differences in classification accuracy. This approach of estimating parameters with a Bayesian model, applying ML for prediction, and conducting simulation studies to explore various scenarios highlights the importance of integrating these methodologies for effective diabetes prediction, and provides insights for public health strategies to mitigate the impact of T2DM.
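
As an illustration of the evaluation step described above, the following is a minimal sketch of 10-fold cross-validation for one of the listed classifiers, k-nearest neighbor. It is not the authors' code: the data are synthetic stand-ins rather than BDHS records, the two numeric predictors and three class labels (0, 1, 2, standing in for non-diabetic, prediabetic, and diabetic) are assumptions, and the function names `make_synthetic_data`, `knn_predict`, and `k_fold_accuracy` are hypothetical.

```python
import random
from collections import Counter


def make_synthetic_data(n=300, seed=0):
    """Generate hypothetical stand-in data (NOT survey data):
    two numeric predictors and a 3-class label (0, 1, 2)."""
    rng = random.Random(seed)
    # Assumed class centers, chosen only so the classes overlap partially.
    centers = {0: (40.0, 22.0), 1: (50.0, 26.0), 2: (55.0, 30.0)}
    X, y = [], []
    for i in range(n):
        label = i % 3  # balanced classes, purely for illustration
        c1, c2 = centers[label]
        X.append((rng.gauss(c1, 8.0), rng.gauss(c2, 3.0)))
        y.append(label)
    return X, y


def knn_predict(train_X, train_y, point, k=5):
    """Predict the majority label among the k nearest training points
    (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, point)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def k_fold_accuracy(X, y, n_folds=10, k=5, seed=0):
    """Return the held-out accuracy of k-NN on each of n_folds folds."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    # Every index lands in exactly one fold.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train_X = [X[i] for i in indices if i not in held]
        train_y = [y[i] for i in indices if i not in held]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in held_out
        )
        accuracies.append(correct / len(held_out))
    return accuracies


X, y = make_synthetic_data()
fold_accuracies = k_fold_accuracy(X, y)
mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
print(f"mean 10-fold accuracy: {mean_accuracy:.3f}")
```

Each observation is held out exactly once, so the mean over the ten folds estimates out-of-sample accuracy. Comparing several classifiers would mean running the same folds with a different prediction function in place of `knn_predict`.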

Keywords

Bayesian statistics; machine learning; mixed effect model; simulation; diabetes prediction.

Subject

Public Health and Healthcare, Public Health and Health Services
