1. Introduction
Cancer is the second leading cause of death in the United States, after heart disease. Projections for 2024 estimated 2.0 million new cases and 611,720 cancer deaths [1]. Encouragingly, medical advancements have increased survival rates among patients with cancer [2,3]. However, cancer survivors often deal with multiple short- and long-term side effects over the course of their cancer treatments [2,3]. These effects include physical (e.g., pain, neuropathy, functional limitations), gastrointestinal (GI), and mental health (e.g., depression, anxiety) concerns [2-4]. The prevalence and severity of these physical, GI, and mental health concerns can vary widely, impacting survivors’ health-related quality of life (HRQOL), treatment adherence, daily functioning, nutrition, and overall prognosis. Addressing these overall health concerns is essential for enhancing cancer survivors’ well-being in the long term.
Of note, GI symptoms often persist in cancer survivors even after completing treatment. These symptoms include nausea/vomiting, appetite loss, altered bowel movements (e.g., diarrhea or constipation), bloating, indigestion, heartburn, and abdominal pain [5-9]. In cancer survivors with mixed cancer types, GI symptoms rank among the most common chronic physical side effects of cancer treatments, after psychological distress and fatigue [10]. In 142 breast cancer survivors, the GI symptom cluster was the second most prevalent after chemotherapy [11]. In 413 colorectal cancer survivors, 81% experienced persistent GI symptoms 8 years post-treatment [8]. In a review of GI toxicity after radiotherapy in rectal cancer survivors, long-term GI toxicity continued for over 3 months and included diarrhea (35%), fecal incontinence (22%), abdominal gas (71%), and abdominal pain (13%) [12].
GI side effects related to cancer treatments are prevalent in older adult cancer patients, impacting physical and social functioning and HRQOL [13]. GI symptoms are significant concerns for older adult cancer patients, with the incidence of overall GI symptoms reported to be as high as 40% in cancer patients on standard-dose chemotherapy and 100% in those on high-dose chemotherapy [14]. Several factors contribute to the increased prevalence of GI issues in this population. First, the aging process causes clinically significant effects on oropharyngeal motility, upper-esophageal motility, colonic function, and GI immunity [15]. Second, older adults often have comorbidities and long-term exposure to medications, alcohol, and tobacco that may exacerbate GI distress [16]. Third, cancer treatments can induce accelerated aging in individuals with cancer [17]. Mechanisms such as oxidative stress, inflammation, and mitochondrial dysfunction are implicated [17]. This accelerated aging phenomenon can worsen existing GI health conditions. As such, GI health concerns may be associated with the aging process, and cancer survivors can be more vulnerable to these connections.
Recent studies highlight the increased significance of biological over chronological aging in cancer survivors’ physical and psychological well-being [18-20]. Of note, telomere length (TL), which shortens during cell division, is a validated measure of biological aging [21,22]. In individuals of the same chronological age, shorter TL is linked to accelerated biological aging and various health conditions in cancer survivors [20,21]. While the association of TL with survival and mortality is well-studied in cancer survivors [22], its association with HRQOL, including GI health, requires further investigation [18,19].
Social determinants of health (SDOH) significantly impact the physical and mental health of cancer survivors. Factors such as race/ethnicity, socioeconomic status, education, and marital status play crucial roles in the health outcomes of cancer survivors [23,24]. Chronic stress associated with poor SDOH triggers systemic inflammation, exacerbating physical symptoms [25,26]. Moreover, there is a potential link between TL, SDOH, and inflammation [27]. Poor SDOH status was associated with TL shortening due to chronic stress and inflammation in US adults living in the community [27]. Therefore, SDOH and TL may be related to GI health in cancer survivors. Understanding this complex interplay could inform interventions to improve GI health in cancer survivors.
The classification of GI health conditions and identification of contributing factors are crucial steps in choosing and applying personalized interventions for cancer survivors [28,29]. Machine learning (ML) offers substantial advantages in cancer survivorship care, particularly in classification or prediction models [30,31]. Unlike traditional statistical methods, ML can handle small sample sizes and multiple variables with complex relationships by controlling covariates and multicollinearity. It excels at identifying intricate patterns, handling high-dimensional data, and adapting over time [30,31]. This capability is especially beneficial in cancer survivorship research, where the number of survivors for certain types of cancer might be limited and the relationships among cancer treatments and health outcomes can be complex [28,29]. While ML has been employed to develop predictive models for cancer diagnosis and survival [32], its application to GI health conditions in cancer survivors remains relatively rare.
Therefore, by leveraging ML with high precision, we aimed to develop and validate an ML classification model of GI health conditions (better vs. worse), incorporating TL and SDOH indicators as our primary interests, and demographic and clinical characteristics including inflammatory markers as secondary interests. The current study is a pilot to explore and identify the significant features including biological aging markers (i.e., TL in our study), and SDOH indicators to classify GI health conditions in adult cancer survivors, not just limited to those over 65. This approach enhances the performance of ML classification models by increasing sample size and providing a comprehensive understanding of GI health across different age groups.
4. Discussion
This study is the first to develop and validate ML classification models for GI health in adult cancer survivors using supervised ML approaches to account for multiple factors. Although we used cross-sectional data, the ML algorithms used in our study constructed classification models for GI health based on demographic and clinical characteristics (including inflammatory markers), TL, and SDOH factors, with prediction accuracy ranging from good (0.5 < AUC < 0.7) to moderate-to-high (AUC > 0.7) [29-31,49]. We also identified the relative importance of features classifying GI health conditions, demonstrating that TL and some SDOH features (e.g., economic status, lifestyles) significantly influence the outcome classification (Better vs. Worse GI health status). The ML models developed and validated in our study could inform personalized approaches to identify cancer survivors at high risk for long-term GI distress and thus support tailored interventions that address the unmet needs triggering GI distress in adult cancer survivors.
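To make the AUC metric referenced above concrete, the following minimal Python sketch computes AUC as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations; they are not the study's data or code.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (e.g., worse GI health), 0 otherwise.
    scores: model scores, where higher means more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: four cases, two per class.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes, which is why the thresholds quoted above are read relative to 0.5.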
Although various predictive ML models have been applied in cancer survivorship research, for example to predict cancer diagnosis risk, estimate survival, or detect psychological symptoms [51-54], few studies have applied ML algorithms to classify or predict GI health in cancer survivors. Previous research has identified risk factors for GI distress in cancer survivors [7,8], but these studies did not explore the associations of TL and SDOH with GI distress. Emerging evidence supports the impact of SDOH [55,56] and biological age [56,57] on symptom disparities and HRQOL in cancer survivors. Our study addresses this gap by demonstrating the feasibility of using ML approaches to classify GI health. Specifically, we explore how TL and SDOH factors contribute to GI health in cancer survivors, providing new knowledge in this area. The ML models can handle numerous features effectively, minimizing both Type I and Type II errors in multiple comparisons. This advantage is often not achievable with traditional statistical methods (e.g., regressions and univariate analyses). Furthermore, ML models predict or classify GI health conditions more accurately than traditional statistical methods by leveraging large datasets and complex relationships among multiple input features.
Our findings suggested that not all features contributed equally to classifying GI health conditions. TL was identified as the most influential factor in GI health, independent of chronological age, suggesting a potential role for biological aging in GI conditions. The results of our study identified the positive relationships between better GI health, younger age, and longer TL. Having an income higher than the poverty level and routine physical activity also significantly contributed to better GI health.
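This section does not state how feature importance was computed. As one common, model-agnostic option, the sketch below illustrates permutation importance: shuffle a single feature column, re-score the model, and record the resulting drop in accuracy. It is offered under that assumption only and is not the authors' procedure; the names `accuracy` and `permutation_importance`, and the toy classifier in the usage note, are hypothetical.

```python
import random
from typing import Callable, List, Sequence

Row = List[float]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of cases whose predicted label matches the true label."""
    return sum(int(t == p) for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(
    predict: Callable[[Row], int],
    X: List[Row],
    y: Sequence[int],
    n_repeats: int = 20,
    seed: int = 0,
) -> List[float]:
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the outcome
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = accuracy(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
        importances.append(sum(drops) / n_repeats)
    return importances
```

For example, with a toy classifier that looks only at the first feature (`lambda row: int(row[0] > 0.5)`), shuffling the second feature cannot change any prediction, so its importance is exactly zero, while the first feature's importance is non-negative whenever the unshuffled model classifies every case correctly.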
Telomeres, protective caps at the ends of chromosomes, play a crucial role in cellular aging [39]. Beyond chronological age, short TL is associated with cellular senescence, where cells lose their ability to divide and function properly. Furthermore, senescent cells release inflammatory molecules, contributing to chronic inflammation associated with GI disorders and symptoms such as inflammatory bowel disease, altered bowel patterns, abdominal pain, indigestion, bloating, nausea, and gastroenteritis [19,58]. Biological aging influences gut health by impairing the integrity of the intestinal barrier, affecting immune cell function, and impacting gut microbial diversity [19,58]. Further, our findings suggest that biological age may better reflect the functional aging of the GI tract than chronological age [59]. Wang et al. [19] similarly found that longer leukocyte TL was associated with better GI function in patients with functional GI disorders. Investigating the mechanisms responsible for the shorter leukocyte TL observed in these settings could provide insights into managing GI health beyond chronological age considerations in cancer survivors.
The poverty-income ratio (PIR) was the second most significant feature of GI health in our study. Other SDOH variables, such as lower income levels and membership in racial/ethnic minority groups, were also associated with worse GI health. Previous studies support our findings that socially and economically vulnerable populations are exposed to more chronic stress, which can accelerate aging and promote a pro-inflammatory status in the body [60,61]. Furthermore, socially and economically vulnerable populations face challenges in accessing healthcare resources, including community health services, oncology care, and primary care providers (PCPs). Additionally, vulnerable populations are more likely to reside in unsafe environments and neighborhoods, which may contribute to housing and food insecurity [62]. Collectively, these risk factors can contribute to various forms of GI distress in cancer survivors [61].
Cancer risk behaviors, including lifestyle choices, smoking, and alcohol consumption, have well-documented associations with physical and psychological symptoms and HRQOL in cancer survivors [24,63,64]. However, limited research has explored the specific relationships between these risk behaviors and GI health in cancer survivorship. Our findings reveal that cancer risk behaviors play a significant role in GI health conditions. Although previous research has primarily focused on other aspects of survivorship such as HRQOL and psychological symptoms [24,63,64], our study highlights the need to consider GI-specific factors. The identification of risk behaviors associated with GI health provides actionable insights for survivor care. Of note, food security was a more significant feature of GI health than self-reported diet quality as measured by HEI. This discrepancy in feature importance for GI health could be due to several reasons. First, self-reported diet quality may not fully capture nutrient intake or align with actual dietary behaviors [65]. Some individuals may report high diet quality despite lacking essential nutrients [65]. Second, food insecurity is prevalent among cancer survivors in the U.S. (reported prevalence ranging from 4% to 83.6%) and directly influences nutrient intake, beyond broader social determinants such as poverty and health literacy [66,67]. Furthermore, food insecurity induces stress, which can increase the risk of GI diagnoses, including GI cancers and GI disorders, by impairing gut motility, immune responses, and barrier function [68,69]. Access to diverse healthy foods ensures essential nutrients and greater microbiome diversity, which are vital for overall well-being including GI health [68,69].
Clinical Implications. ML can play a crucial role in classifying or predicting GI health, particularly for socially vulnerable cancer survivors. ML models analyze data from cancer survivors to pinpoint those at greater risk of GI distress. Once these survivors are identified, targeted interventions can address their unmet needs, whether through pharmacological or non-pharmacological approaches. Integrating ML algorithms into platforms like mobile apps or websites (such as MyChart) is a practical approach. Through such platforms, users can access personalized insights about their GI health, receive recommendations, and make informed decisions based on ML-driven risk classifications. ML models can help to further tailor interventions for high-risk groups by considering their specific social needs and vulnerabilities. For example, routine assessment of accelerated aging in cancer survivors could be essential for overall well-being and GI health. Supporting smoking cessation, promoting healthy lifestyles (healthy diet and physical activity), and minimizing alcohol consumption could also directly impact GI health and serve as an anti-aging strategy. Lastly, routine screening for socio-economic needs may contribute to optimal GI health in cancer survivors. For example, oncologists or PCPs can refer survivors to nutritional education or food assistance programs. Increasing multidisciplinary collaboration with social workers, nutritionists, and community resources is warranted not only for overall HRQOL but also for GI health.
Strengths and Limitations. The strengths of our study lie in the inclusion of a variety of input data, specifically inflammatory markers, TL, and SDOH features. Additionally, our focus on GI health, an unexplored area in cancer survivorship, along with the application of ML models, contributes to the development of powerful classification models for GI health that consider both biological and social mechanisms. The findings of our study also reflect the importance of biological age in GI health conditions, applicable to all adult cancer survivors, not just older adults. Furthermore, our ML model was validated using an independent test dataset. Our study has several limitations. First, NHANES is a cross-sectional survey, which may limit the predictability of our ML model. To enhance predictability, longitudinal studies with predictors and GI health conditions measured at different time points are needed. Second, the usefulness of inflammatory markers (WBC and CRP) for classifying GI health remains unclear in our study. One possible reason is that mean WBC and CRP levels fell within the clinically normal range in our samples. Third, findings regarding prediction performance should be interpreted with caution due to the overall small sample size of the test dataset. Further studies with larger sample sizes are warranted to prevent model overfitting. Fourth, using a single question to ask about GI health may not fully capture GI health conditions. Fifth, the roles of SDOH in the relationship between TL and GI health are unknown. Lastly, cancer-related clinical characteristics, such as cancer stages, years since diagnosis, and types of treatments, were not available in our samples, although they are potential covariates for our ML models related to GI health.
Author Contributions
Conceptualization, CH, XN, CB, MF, AN, DV; methodology, CH, XN, TF, DV; validation, CH, XN, CB, MF, AN, FT, DV; formal analysis, CH, FT; investigation, CH, XN, CB, TF, DV; data curation, CH; writing – original draft preparation, CH, XN; writing – review and editing, CH, XN, CB, DV, MF, AN, FT; visualization, CH; supervision, XN, CB, DV. All authors have read and agreed to the published version of the manuscript.