1. Introduction
Population aging is a major problem worldwide, including in Korea. According to the population projections of Statistics Korea, Korea will become a super-aged society by 2025 [1] and has the fastest-aging population in the world [2]. Population aging also slows economic growth by shrinking the productive workforce. One way to offset this decline and maintain national competitiveness is to bring youth under the age of 15 years and/or seniors over the age of 65 years into the workforce. Rather than sending young people under the age of 15 years, who are still immature physically, mentally, emotionally, and socially, into the labor market, it is preferable to retain physically healthy individuals older than 65 years in the productive workforce and to protect them against early death.
With increasing age, it becomes more important to stay as physically healthy as possible. One study estimated the cost of productivity loss due to premature death among individuals aged 70 years or older at 4.5 to 5.4 trillion KRW for men and 1.4 to 1.7 trillion KRW for women [3]. All individuals inevitably age, both physically and mentally [4]. Muscle mass decreases, the immune system weakens, and the function of all organs deteriorates [5,6]. Aging is not a disease per se, but as homeostasis and recovery ability decrease [7], the risk of contracting diseases increases, eventually leading to death. Korea provides free health checkups at the transitional ages of 40 years and 66 years. The National Health Screening Program for Transitional Ages is a nationwide screening program, started in 2007 and provided by the government free of charge to the individual, that aims to detect and manage chronic diseases and health risk factors early, at ages when major health changes are likely to occur due to aging and lifestyle [8]. If this transitional-age checkup shows people how much a change in a given factor could affect a health outcome, they may be able to discard unhealthy behaviors and adopt healthy ones, and thereby manage their own health.
Most previous studies on this topic have shown which factors have an effect on mortality or survival time through survival analysis [9-11] or logistic regression analysis [12-14]. These studies described their results in terms of “people with risk factors have [several times] the risk of death compared to those without them.” The description of the risk of death having doubled or tripled is abstract and does not have clear meaning to most people. We think that the most effective way to encourage self-management is to predict how long a person’s survival period will be, and to indicate how much the survival period will increase if certain factors are changed.
Therefore, the purpose of this study was to construct a model for predicting the survival time in community-dwelling older individuals by using a deep learning method, and to identify the level of influence on the survival period according to risk factors, so that older individuals can manage their own health.
2. Methods
2.1. Data and study design
We conducted a deep-learning neural network analysis to build a survival prediction model. To this end, we used the Korean National Health Insurance Service (KNHIS) claims database. This database comprises a cohort of 510,000 people (8%) sampled from approximately 6.4 million older individuals who were aged 60 years or older in January 2008 and eligible for national health insurance or medical aid, with records from 2002 to 2019. The cohort was drawn by stratified random sampling according to sex, age, insurance premium decile, and regional district [15]. The data were anonymized and de-identified for research use.
The subjects of this study in particular were older people who lived in a community and took part in the National Health Screening Program for Transitional Ages at the age of 66 years, and were followed up for up to 11 years, from January 1, 2009 to December 31, 2019. The National Health Screening Program for Transitional Ages was conducted from 2007, but as checkup items such as activities of daily living (ADL), fall, and urinary discomfort, which are required for calculating the frailty score, were only included from 2009, we targeted people who underwent the national health checkup at the age of 66 years from 2009 to 2019.
The study was conducted in accordance with the Declaration of Helsinki and was approved by the Institutional Review Board of University of Sangji (1040782-221214-HR-15-108).
2.2. Study population
From 2009 to 2019, 258,000 older people received national health screening checkups at the age of 66 years. We calculated the frailty index as described in a previous study [10]. Older individuals for whom less than 80% of the information required to calculate the frailty index was available were defined as having insufficient data and were excluded, following the exclusion criteria of the preceding report; 68,303 people were excluded on this basis. Of the remaining 189,697, 180,235 (95.0%) survived from 2009 to 2019, and 9,462 (5.0%) died. Unlike survival analysis, the regression approach used here cannot accommodate censored observations for those who survived the follow-up period, so we used the data of the 9,462 deceased individuals as the final dataset for this study.
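The cohort flow in this section can be checked with simple arithmetic. The sketch below uses only the counts reported above; the variable names are illustrative, and the screened total is taken at face value even though the text reports it as a rounded figure (258,000).

```python
# Counts reported in Section 2.2 (258,000 is a rounded figure in the text).
screened = 258_000
excluded_insufficient = 68_303
survived = 180_235
died = 9_462

# People remaining after the data-completeness exclusion.
remaining = screened - excluded_insufficient

# The survivors and the deceased should add up to the remaining cohort.
assert remaining == survived + died

# Percentages as reported in the text, rounded to one decimal place.
pct_survived = round(100 * survived / remaining, 1)
pct_died = round(100 * died / remaining, 1)
print(remaining, pct_survived, pct_died)
```

Only the `died` group enters the prediction model, so every model result in this paper is conditional on death within the follow-up period.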
2.3. Variables
The dependent variable of this study was the survival duration of those who died during the follow-up period. Death was defined as all-cause mortality.
We used the standard procedure for calculating the frailty index. The previous study used 39 health deficit items in five health domains that were assessed during the screening examination: medical history (15 items), biometric or laboratory measures (8 items), physical health (2 items), psychological health (8 items), disability (6 items), and chronic conditions (8 items: arthritis, asthma, chronic kidney disease, congestive heart failure, coronary artery disease, chronic obstructive pulmonary disease, type 2 diabetes, and cancer, excluding nonmelanoma skin cancer) [10]. We modified this approach: all items other than the chronic conditions were used as in the previous study, while chronic conditions were scored separately using Charlson’s comorbidity index (CCI) [16] instead of the eight chronic condition items. Under the original approach, each of the eight chronic conditions is scored as 1 point if the disease is present and 0 points if not. However, because conditions such as cancer and diabetes may differ in their influence on death or health outcomes, we used the CCI, which weights each chronic disease. For example, diabetes is counted as 1 point, and diabetes with complications is counted as 2 points. Thus, whereas the previous study calculated the frailty index from 39 items as the proportion of health deficits present (range 0–1.00, with higher scores indicating greater frailty), we excluded the eight chronic condition items and calculated the proportion of health deficits present from the remaining 31 items (range 0–1.00).
Covariates included sex, health insurance status (national health insurance, medical aid), income level (5 categories: quintiles, for which we used insurance premium as a proxy), level of disability (none, mild, severe), grade of long-term care benefit, and admission to an intensive care unit (ICU). The level of disability was determined by the National Pension Service committee, which consisted of two medical specialists and a social worker, and was based on clinical documentation (disability certificate, medical records, and test results) and video evaluation. The level of disability is divided into six grades; individuals without a disability grade were classified as “none.” Among those with a disability grade, grades 1 and 2 were classified as “severe,” and the remaining grades were classified as “mild.”
Korea provides a long-term care benefit to older individuals who have had difficulty taking care of themselves for more than 6 months due to old age or geriatric disease. The extent of the long-term care benefit is divided into five grades (grades 1 to 5). The lower the grade, the more long-term care is needed. In principle, grades 1 to 2 indicate eligibility for admission to a nursing facility, and grades 3 to 5 indicate receipt of long-term care services at home. Admission to an ICU was defined as having been admitted to an ICU during the year of the transitional-age checkup.
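The variable definitions above can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, the choice to divide by the number of answered items (rather than all items) is an assumption, and the CCI weight table contains only the two weights stated in the text.

```python
from typing import Iterable, Optional, Sequence

MIN_COMPLETENESS = 0.80  # exclusion rule from Section 2.2

def frailty_index(deficits: Sequence[Optional[float]]) -> Optional[float]:
    """Proportion of health deficits present among the answered items.

    Each entry is a deficit score in [0, 1], or None when the item is missing.
    Returns None when fewer than 80% of the items are available.
    Assumption: the denominator is the number of answered items.
    """
    answered = [d for d in deficits if d is not None]
    if not deficits or len(answered) / len(deficits) < MIN_COMPLETENESS:
        return None
    return sum(answered) / len(answered)

def disability_level(grade: Optional[int]) -> str:
    """Collapse the six-grade disability scale into none / severe / mild."""
    if grade is None:
        return "none"
    if grade in (1, 2):
        return "severe"
    if 3 <= grade <= 6:
        return "mild"
    raise ValueError(f"unexpected disability grade: {grade}")

# Illustrative subset only: the two CCI weights mentioned in the text.
CCI_WEIGHTS = {"diabetes": 1, "diabetes_with_complication": 2}

def cci_score(conditions: Iterable[str]) -> int:
    """Sum the weights of the conditions that are present."""
    return sum(CCI_WEIGHTS[c] for c in conditions)

# Example: 31 items, 5 deficits present, 2 items missing.
example_items = [1.0] * 5 + [0.0] * 24 + [None] * 2
example_fi = frailty_index(example_items)  # 5 / 29
```

A person with too many missing items gets `None` rather than a score, which mirrors the exclusion of individuals with insufficient data.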
2.4. Statistical analysis
The SHAP algorithm, which is based on the Shapley value from game theory, was applied in this study to calculate how much influence each feature had on the predicted result (Figure 1). The SHAP value obtained from the predictive model indicates the importance of the feature. To obtain SHAP values, we used tree-SHAP, which computes them faster than the traditional kernel-SHAP by exploiting the structure of a tree model. Among the various tree algorithms, XGBoost uses boosting and has the advantage of faster learning than random forest.
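To make the Shapley value concrete, the following sketch computes exact Shapley values by brute force for a toy value function. It is not tree-SHAP and not the study's model: the feature names, the baseline of 0.5, and the effect sizes are all hypothetical, chosen only to show how an interaction is shared between features and how the contributions add up.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, features):
    """Exact Shapley values: weighted average of each feature's marginal
    contribution over all subsets of the other features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi

def toy_value(known):
    """Hypothetical scaled survival prediction given the known features."""
    v = 0.5  # baseline prediction when no feature is known
    if "cci" in known:
        v -= 0.10
    if "frailty" in known:
        v -= 0.06
    if "cci" in known and "frailty" in known:
        v -= 0.04  # interaction, split equally between the two features
    return v  # "icu" has no effect in this toy example

features = ["cci", "frailty", "icu"]
phi = shapley_values(toy_value, features)

# Additivity: contributions sum to full prediction minus baseline.
full_minus_base = toy_value(frozenset(features)) - toy_value(frozenset())
```

The additivity property shown at the end is what allows a prediction to be read as a baseline plus one signed contribution per feature. Brute force needs a number of model evaluations that grows exponentially with the number of features, which is why faster algorithms such as tree-SHAP are used in practice.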
In this paper, the AAA learning model was applied because the training dataset was very small, and a regression model was used rather than a classification model. In addition, all data were used for both training and testing, without a separate held-out test set. In the final stage, features whose SHAP values were 0.0 were excluded as meaningless, because they did not affect the learning model.
The model learning time was 482.07 seconds. SHAP values were then calculated from the trained model using the shapviz library, which took 65.31 seconds. The input (X) of the learning model comprised 12 features, that is, all raw data fields except LIFE_TIME. The output variable (Y) was LIFE_TIME (survival time), which ranged from 0 to 10.9 years and was Min–Max scaled to the range [0, 1] to facilitate model learning. The model was trained using logistic regression in XGBoost. Training ran for up to 1,000 iterations to minimize the root mean square error (RMSE) and was terminated early if performance did not improve for 50 consecutive iterations. All other parameters were left at the defaults of XGBoost 1.7.5.1.
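The scaling and stopping rules described above can be sketched in a few lines. This is an illustration under stated assumptions, not the training code: the helper names are hypothetical, the LIFE_TIME range of 0 to 10.9 years is taken from the text, and the early-stopping loop operates on a plain list of per-round RMSE values rather than on a real XGBoost run.

```python
from math import inf

LIFE_TIME_MIN = 0.0   # years, from the text
LIFE_TIME_MAX = 10.9  # years, from the text

def scale(years):
    """Min-Max scale a survival time in years to [0, 1]."""
    return (years - LIFE_TIME_MIN) / (LIFE_TIME_MAX - LIFE_TIME_MIN)

def unscale(scaled):
    """Map a scaled value back to years."""
    return scaled * (LIFE_TIME_MAX - LIFE_TIME_MIN) + LIFE_TIME_MIN

def best_round_with_early_stopping(rmse_by_round, patience=50, max_rounds=1000):
    """Return (best_round, best_rmse), stopping once `patience` consecutive
    rounds have passed without a new best RMSE."""
    best_rmse, best_round = inf, -1
    for i, rmse in enumerate(rmse_by_round[:max_rounds]):
        if rmse < best_rmse:
            best_rmse, best_round = rmse, i
        elif i - best_round >= patience:
            break
    return best_round, best_rmse

# A scaled value of 0.38 corresponds to about 4.1 years.
years_at_038 = unscale(0.38)

# Toy RMSE curve: improves for three rounds, then plateaus.
toy_curve = [0.5, 0.4, 0.3] + [0.35] * 100
toy_best = best_round_with_early_stopping(toy_curve)
```

Because the model is trained on the scaled target, any threshold or error quoted on the [0, 1] scale has to be passed through `unscale` (or multiplied by the range) before it can be read in years.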
3. Results
Table 1 shows the characteristics of the study population. Of the 9,462 eligible individuals, 6,326 (66.9%) were men and 3,136 (33.1%) were women. There were 2,436 (25.8%) people with a CCI score of 0 and 7,026 (74.2%) with a score of 1 point or more. The mean frailty index was 0.1547.
Individual variables | |
Sex, n (%) | |
Male | 6,326 | (66.9)
Female | 3,136 | (33.1)
Income level, n (%) | |
0 percentile (Medical Aid) | 943 | (10.0)
1~20 percentile | 1,579 | (16.7)
21~40 percentile | 1,219 | (12.9)
41~60 percentile | 1,542 | (16.3)
61~80 percentile | 2,115 | (22.4)
81~100 percentile | 2,064 | (21.8)
Charlson’s comorbidity index, n (%) | |
0 | 2,436 | (25.8)
1 | 2,791 | (29.5)
2 | 1,891 | (20.0)
3 | 1,119 | (11.8)
4 | 541 | (5.7)
5 | 289 | (3.1)
6 | 220 | (2.3)
7 | 111 | (1.2)
8 | 48 | (0.5)
9 | 11 | (0.1)
10 or more | 5 | (0.1)
Frailty index, mean (SD) | 0.1547 | (0.0824)
Long-term care benefit grade, n (%) | |
1st grade | 471 | (5.0)
2nd grade | 1,020 | (10.8)
3~5th grade | 70 | (0.7)
Out of grade | 463 | (4.9)
Those who have not applied for a grade | 7,438 | (78.6)
Disability grade, n (%) | |
None | 7,384 | (77.7)
Severe | 872 | (9.2)
Mild | 1,236 | (13.1)
Combination of DM, HTN, and dyslipidemia, n (%) | |
0. DM(+), HTN(-), dyslipidemia(-) | 5,660 | (59.8)
1. DM(-), HTN(+), dyslipidemia(-) | 2,385 | (25.2)
2. DM(-), HTN(-), dyslipidemia(+) | 41 | (0.4)
3. DM(+), HTN(+), dyslipidemia(-) | 778 | (8.2)
4. DM(+), HTN(-), dyslipidemia(+) | 42 | (0.4)
5. DM(-), HTN(+), dyslipidemia(+) | 528 | (5.6)
6. DM(+), HTN(+), dyslipidemia(+) | 12 | (0.1)
7. DM(-), HTN(-), dyslipidemia(-) | 16 | (0.2)
Smoking status, n (%) | |
No smoking | 4,910 | (52.0)
Ex-smoking | 2,166 | (22.9)
Current smoking | 2,374 | (25.1)
Alcohol consumption habit, n (%) | |
No drinking | 870 | (9.2)
2~3 times per month | 1,299 | (13.7)
Once or twice per week | 454 | (4.8)
More than 3 times per week | 6,839 | (72.3)
Use of intensive care unit, n (%) | |
No | 9,232 | (97.6)
Yes | 230 | (2.4)
The RMSE at each iteration round of the model learning process is plotted in Figure 2. The performance of the learning model was confirmed as RMSE = 0.1339508 and R² = 0.701071.
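For readers less familiar with these two metrics, the sketch below shows how RMSE and R² are computed from actual and predicted values. The four data points are made up for illustration and are not study data. The final conversion to years assumes that the reported RMSE is on the Min–Max scaled target (range 0 to 10.9 years), which the text does not state explicitly.

```python
from math import sqrt

def rmse(y_true, y_pred):
    """Root mean square error between actual and predicted values."""
    n = len(y_true)
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Made-up scaled survival times, for illustration only.
y_true = [0.1, 0.4, 0.6, 0.9]
y_pred = [0.2, 0.4, 0.5, 0.8]
toy_rmse = rmse(y_true, y_pred)
toy_r2 = r_squared(y_true, y_pred)

# Assumption: the reported RMSE is on the scaled target, so multiplying by
# the LIFE_TIME range (10.9 years) expresses it in years.
reported_rmse_years = 0.1339508 * 10.9
```

Under that assumption the reported error is roughly 1.46 years. Because the same data were used for training and evaluation, both metrics describe fit to the training data rather than performance on unseen individuals.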
Figure 3 plots the predicted survival time (Yp) on the Y-axis against the actual survival time on the X-axis when the learning data were applied to the model. Each point is one input datum: its X value is the actual survival time (LIFE_TIME), and its Y value is the predicted LIFE_TIME. The blue line is the line Y = X, on which the predicted survival time equals the actual survival time. The green line shows the median predicted value (Yp) among data with the same actual value, and the red line is the trend line drawn through these medians. For actual LIFE_TIME values of 0.38 (4.1 years) or less, the model tended to predict a survival time longer than the actual time; above 0.38, the predicted survival time was shorter than the actual survival time.
Figure 4 shows the arithmetic mean of the absolute SHAP values for each feature, with the SHAP values calculated from the learned model.
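The feature ranking summarized by the mean absolute SHAP value can be sketched as follows. The SHAP values and feature names below are invented for illustration and are not the study's results; dropping features whose mean absolute SHAP value is exactly zero reflects our reading of the exclusion step described in the Methods.

```python
def mean_abs_shap(shap_rows):
    """Mean of |SHAP| per feature.

    shap_rows: list of dicts mapping feature name -> SHAP value,
    one dict per individual.
    """
    n = len(shap_rows)
    features = shap_rows[0].keys()
    return {f: sum(abs(row[f]) for row in shap_rows) / n for f in features}

# Invented SHAP values for three individuals (illustration only).
rows = [
    {"cci": 0.05, "frailty": -0.02, "icu": 0.0},
    {"cci": -0.10, "frailty": 0.03, "icu": 0.0},
    {"cci": -0.03, "frailty": -0.04, "icu": 0.0},
]

importance = mean_abs_shap(rows)

# Rank features from most to least influential.
ranking = sorted(importance, key=importance.get, reverse=True)

# Keep only features that contribute something to the predictions.
informative = [f for f in ranking if importance[f] > 0]
```

Taking the absolute value before averaging matters: a feature that pushes predictions up for some people and down for others would otherwise appear unimportant because its signed contributions cancel.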
Figure 5 plots the feature value and the SHAP value of each input datum as a point. The CCI, frailty index, and long-term care benefit grade had the greatest influence, while admission to an ICU had the least influence. Because the predicted value is obtained by adding the SHAP values of all features of the corresponding datum, the output (survival period) of the model became smaller when a SHAP value was negative and larger when it was positive. For long-term care benefit grade, the larger the value, the more negative the SHAP value and the lower the predicted survival. Similarly, for the CCI and the frailty index, the predicted survival period decreased as the feature value increased and increased as the feature value decreased.
When the CCI was 0, the SHAP value was about 0.05. As the CCI value increased, the SHAP value decreased. Thus, as Figure 6 shows, the predicted survival time became shorter as the CCI value increased.
4. Discussion
In this study, we investigated what risk factors affect survival time, and to what extent, in 66-year-olds during an 11-year follow-up period, using a deep learning method. The purpose of this study was to develop a survival prediction model to assist older people, at the age of 66 years, in recognizing modifiable risk factors, so that they can manage their own health, by predicting the effect of each risk factor on survival period. The degree of influence of each risk factor on the survival period was expressed as the SHAP value. The most influential factor during the 11-year survival period was CCI, which represents chronic diseases in older individuals. Long-term care benefit grade was the third most influential factor. Diabetes, hypertension, and dyslipidemia, which are common among chronic diseases, were treated as separate variables rather than including them in the CCI calculation. The SHAP value of the combination of diabetes, hypertension, and dyslipidemia was about half of that of the CCI. As such, diabetes, hypertension, and dyslipidemia are significant factors that affect survival time as much as other chronic diseases.
Chronic diseases and frailty are closely related [17,18]. Frailty has been described as the loss of ability to adapt to stress because of diminished functional reserves [19]. Some studies propose that the presence of chronic diseases contributes to the onset of frailty [20,21]. In addition, many studies have found that various chronic diseases per se can cause frailty, which reduces body function, or that the side effects of medications used to treat chronic diseases can cause frailty [10,20-25]. Frailty and the deterioration of body functions can lead to death or serious disability. In particular, the SHAP value related to the effect of CCI on survival period indicated that the survival period decreased by 0 to 2 points according to the comorbid conditions. Since 1 or 2 chronic diseases did not reduce survival time, whereas 3 or more complex chronic diseases did, it is important to prevent progression to multiple chronic diseases. We confirmed that the frailty index also reduced the survival period once it reached a value of 0.3 or more. In many preceding studies, a frailty index of 0.3 marked the transition of the frailty level from moderate to severe. Consequently, it is also important to prevent the frailty level from progressing beyond the moderate level. We calculated the frailty index using 31 factors, some of which were modifiable. Efforts should be made to reduce the frailty index as much as possible by addressing these modifiable factors. Moreover, it is important to prevent complications caused by chronic diseases and to prevent further disability by managing chronic diseases well. For those who did not apply for a long-term care benefit grade, the SHAP value for the survival period was negative. Our survival time prediction model was based on individuals who died during the follow-up period, and we found that the SHAP value was positive among those with the grade requiring the most long-term care services, whereas it was negative in those who did not apply for a long-term care benefit grade. We consider that this was because these individuals died before they could apply for a long-term care benefit grade.
Our research had significant limitations. First, the survival period prediction model was built using a deep learning method on the data of 9,462 people, which may be insufficient. Second, we did not consider other factors affecting survival that were not included in the claims database. Third, the predictive model used characteristics of the individuals at the age of 66 years, so factors that could change after that age, such as the frailty index, chronic disease status, smoking status, and drinking habits, could not be considered. Likewise, the 31 items used to calculate the frailty index were recorded at the transitional-age examination at 66 years, so only data from a single point in time were used, without reflecting factors that change daily or yearly.
Despite these limitations, our research offers some advantages. In many previous studies, a model was built to predict survival per se, utilizing data of people with specific diseases or using hospital data. However, the present study constructed a model for predicting the survival period, rather than death or survival, and targeted older individuals living in the community rather than patients with specific diseases. In addition, the observation period was 11 years, which was relatively longer than that of previous studies. Moreover, data such as clinical and biological examination results, and functional disability data necessary to calculate the frailty index were used.
5. Conclusions
The number of chronic diseases, the frailty index, long-term care benefit grade, and the combination of diabetes, hypertension, and dyslipidemia were found to be the factors that most affected survival time. As the number of chronic diseases and the frailty index increased, the survival period decreased. In particular, multi-comorbidity and above-average frailty reduced survival time. A program that can prevent premature death of older individuals should be implemented, using customized care for older people who have complex chronic diseases or who are frail above the average level. Older people should be encouraged to recognize the modifiable factors that affect their survival period and to manage their health in order to extend their expected lifespan.
Author Contributions
Conceptualization, K.H.C. and K.M.K.; Methodology, K.M.K.; Software, J.M.B.; Validation, K.H.C. and J.M.B.; Formal analysis, K.H.C.; Investigation, K.H.C.; Resources, K.H.C.; Data curation, K.H.C.; Writing – original draft preparation, K.H.C. and K.M.K.; Writing – review and editing, K.H.C.; Supervision, K.H.C.; Funding acquisition, K.H.C. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Research Foundation of Korea (NRF), with funding from the Ministry of Education (MOE), in 2023 (2022RIS-005).
Institutional Review Board Statement
The study was conducted in accordance with the Declaration of Helsinki and was approved by the Institutional Review Board of University of Sangji (1040782-221214-HR-15-108).
Informed Consent Statement
Patient consent was waived due to the nature of the study and the use of de-identified database data.
Data Availability Statement
This data is provided only to researchers, and researchers can also access the data using a security key. According to the National Health Insurance Service’s data disclosure regulations, the data cannot be disclosed to anyone other than researchers.
Acknowledgments
This research was supported by the "Regional Innovation Strategy (RIS)" through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (MOE) in 2023 (2022RIS-005).
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Statistics Korea. Population Projection for Korea (2020~2070). 2021.
2. Statistics Korea (Press release). Population status and prospects of the world and Korea reflecting future population projections in 2021. 2022.9.5.
3. Gong YH, Jo MW. Cost estimation of productivity loss of elderly over 70 due to premature mortality reflecting elderly employment in an aged society. Korean Association of Health Technology Assessment. 2017;5(2):89-94.
4. WHO. World report on ageing and health 2015. Geneva: World Health Organization; 2015.
5. Wilson D, Jackson T, Sapey E, Lord JM. Frailty and sarcopenia: the potential role of an aged immune system. Ageing Res Rev. 2017;36:1-10. https://doi.org/10.1016/j.arr.2017.01.006.
6. Barbat-Artigas S, Pion CH, Leduc-Gaudet JP, Rolland Y, Aubertin-Leheudre M. Exploring the role of muscle mass, obesity, and age in the relationship between muscle quality and physical function. J Am Med Dir Assoc. 2014;15:303.e13-303.e20.
7. Rattan SI. Aging is not a disease: implications for intervention. Aging Dis. 2014;5(3):196-202.
8. Kim HS, Shin DW, Lee WC, Kim YT, Cho B. National screening program for transitional ages in Korea: a new screening for strengthening primary prevention and follow-up care. J Korean Med Sci. 2012;27(suppl):S70-S75.
9. Pedone C, Scarlata S, Forastiere F, Bellia V, Antonelli Incalzi R. BODE index or geriatric multidimensional assessment for the prediction of very-long-term mortality in elderly patients with chronic obstructive pulmonary disease? a prospective cohort study. Age Ageing. 2014 Jul;43(4):553-8.
10. Jang I, Jung H, Shin J, Kim DH. Assessment of frailty index at 66 years of age and association with age-related diseases, disability, and death over 10 years in Korea. JAMA Netw Open. 2023;6(3):e2248995.
11. Suemoto CK, Ueda P, Beltrán-Sánchez H, Lebrão ML, Duarte YA, Wong R, et al. Development and Validation of a 10-Year Mortality Prediction Model: Meta-Analysis of Individual Participant Data From Five Cohorts of Older Adults in Developed and Developing Countries. J Gerontol A Biol Sci Med Sci. 2017 Mar 1;72(3):410-416.
12. van de Vorst IE, Golüke NMS, Vaartjes I, Bots ML, Koek HL. A prediction model for one- and three-year mortality in dementia: results from a nationwide hospital-based cohort of 50,993 patients in the Netherlands. Age Ageing. 2020 Apr 27;49(3):361-367.
13. Zhang Z, Xie D, Kurichi JE, Streim J, Zhang G, Stineman MG. Mortality predictive indexes for the community-dwelling elderly US population. J Gen Intern Med. 2012 Aug;27(8):901-10.
14. Tirandi A, Arboscello E, Ministrini S, Liberale L, Bonaventura A, Vecchié A, et al. Early sclerostin assessment in frail elderly patients with sepsis: insights on short- and long-term mortality prediction. Intern Emerg Med. 2023 Aug;18(5):1509-1519.
15. National Health Insurance Sharing Service. Elderly Cohort. Available at: National Health Insurance Bigdata (nhis.or.kr). Accessed: 04.25.23.
16. Charlson ME, Pompei P, Ales KL, MacKenzie CR. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis. 1987;40(5):373-83.
17. Lee H, Lee E, Jang IY. Frailty and Comprehensive Geriatric Assessment. J Korean Med Sci. 2020 Jan 20;35(3):e16.
18. Zazzara MB, Vetrano DL, Carfì A, Onder G. Frailty and chronic disease. Panminerva Med. 2019 Dec;61(4):486-492.
19. Xue QL. The frailty syndrome: definition and natural history. Clin Geriatr Med. 2011 Feb;27(1):1-15.
20. Acquarone E, Monacelli F, Borghi R, Nencioni A, Odetti P. Resistin: A reappraisal. Mech Ageing Dev. 2019 Mar;178:46-63.
21. Capurso C. Increasing Our Understanding of How Dietary Components Can Affect Cellular Mechanisms That Regulate Aging and Slow the Onset of Frailty and Chronic Diseases. Nutrients. 2023 Jun 9;15(12):2687.
22. Sugihara T, Harigai M. Targeting Low Disease Activity in Elderly-Onset Rheumatoid Arthritis: Current and Future Roles of Biological Disease-Modifying Antirheumatic Drugs. Drugs Aging. 2016 Feb;33(2):97-107.
23. Fries JF. Frailty, heart disease, and stroke: the Compression of Morbidity paradigm. Am J Prev Med. 2005 Dec;29(5 Suppl 1):164-8.
24. Marengoni A, Vetrano DL, Manes-Gravina E, Bernabei R, Onder G, Palmer K. The Relationship Between COPD and Frailty: A Systematic Review and Meta-Analysis of Observational Studies. Chest. 2018 Jul;154(1):21-40.
25. Veronese N, Cereda E, Stubbs B, Solmi M, Luchini C, Manzato E, et al. Risk of cardiovascular disease morbidity and mortality in frail and pre-frail older adults: Results from a meta-analysis and exploratory meta-regression analysis. Ageing Res Rev. 2017 May;35:63-73.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).