Preprint Article, Version 1. This version is not peer-reviewed.

A Machine Learning Classification Model for Gastrointestinal Health in Cancer Survivors: Roles of Telomere Length and Social Determinants of Health

Version 1: Received: 27 September 2024 / Approved: 27 September 2024 / Online: 30 September 2024 (09:22:12 CEST)

How to cite: Han, C. J.; Ning, X.; Burd, C. E.; Tounkara, F.; Kalady, M. F.; Noonan, A. M.; Von Ah, D. A Machine Learning Classification Model for Gastrointestinal Health in Cancer Survivors: Roles of Telomere Length and Social Determinants of Health. Preprints 2024, 2024092272. https://doi.org/10.20944/preprints202409.2272.v1

Abstract

Background. Gastrointestinal (GI) distress is prevalent and often persistent among cancer survivors, affecting their quality of life, nutrition, daily function, and mortality. GI health screening is important for preventing and managing this distress, yet accurate methods for classifying GI health remain unexplored. We aimed to develop machine learning (ML) models that classify GI health status (better vs. worse) in cancer survivors by incorporating indicators of biological aging and social determinants of health (SDOH).

Methods. We included 645 adult cancer survivors from the 1999-2002 NHANES survey. Using training and test datasets, we employed six ML models to classify GI health status (better vs. worse). The models incorporated leukocyte telomere length (TL), SDOH, and demographic/clinical data.

Results. Among the ML models, the random forest (RF) performed best in the training dataset, achieving a high area under the curve (AUC = 0.98). In the test dataset, the gradient boosting machine (GBM) demonstrated excellent classification performance with a high AUC (0.80). TL, several socio-economic factors, cancer risk behaviors (including lifestyle choices), and inflammatory markers were associated with GI health. The most important input features for better GI health in our ML models were longer TL and an annual household income above the poverty level, followed by routine physical activity, low white blood cell counts, and food security.

Conclusions. Our findings provide valuable insights into classifying GI health and identifying its related risk factors, including biological aging and SDOH indicators. Further longitudinal studies and external clinical validation are necessary to enhance model predictability.
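
For readers unfamiliar with the AUC metric reported above, the following is a minimal illustrative sketch of how an AUC can be computed from predicted scores for a binary outcome (here, 1 = worse GI health, 0 = better). It is not the authors' code: the function name `roc_auc` and the toy labels and scores are hypothetical, and the sketch uses the rank-based definition of AUC (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case).

```python
def roc_auc(labels, scores):
    """Rank-based AUC: the fraction of (positive, negative) pairs in which
    the positive case has the higher score; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. Computing the metric separately on training and held-out test data, as the abstract reports, is what allows the two values to be compared.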

Keywords

cancer survivors, gastrointestinal health, telomere, social determinants of health, machine learning.

Subject

Public Health and Healthcare, Primary Health Care
