1. Introduction
According to the recent Global Burden of Disease Study, noncommunicable diseases (NCDs) are the primary cause of death, accounting for 74% of all annual mortality [1]. Most of these deaths happen prematurely, before the age of 70. Cardiovascular diseases are the leading cause of mortality related to NCDs, resulting in the loss of 17.9 million lives annually. They are followed by cancer, type 2 diabetes (DM), and kidney disease, predominantly caused by DM [1]. Modifiable health behaviors, such as tobacco use, physical inactivity, and poor diet, significantly contribute to an increase in modifiable metabolic risk factors, including hypertension, hyperglycemia, dyslipidemia, and obesity [2]. The interplay of these risk factors increases the risk of NCD morbidity and mortality. With rising global NCD incidence rates, implementing nuanced approaches targeting metabolic risk factors may help with NCD prevention.
Limitations of Measured CRF in Healthcare and Public Health
CPET and GXT are the most precise methods for objectively measuring CRF to predict health outcomes. However, practical challenges hinder their widespread use. These challenges include clinical guidelines, high costs, time requirements, and the necessity for specialized staff and equipment. Such obstacles make routine CRF assessment impractical in healthcare and community settings [21]. These limitations are also apparent when conducting epidemiological investigations on metabolic health outcomes. Only about eleven unique cohorts of healthy adults with CRF measured at baseline, such as the Aerobics Center Longitudinal Study (ACLS), are available for longitudinal analyses [22].
In response to these limitations, there has been a growing emphasis on developing non-exercise estimation equations for CRF (eCRF). These equations use readily available data, such as self-reported physical activity levels, weight, and age, often found in electronic health records or collected through population health surveys. Recent reviews of eCRF equations by Ross et al. and Wang et al. have shown that these models yield moderate (R² = .60) to high (R² = .80) correlations with directly measured CRF among generally healthy adults [6,21]. Artero et al. conducted a pioneering study in 2014 on the ability of eCRF to predict all-cause mortality and heart disease among Caucasian Americans, finding that low eCRF predicts health outcomes as effectively as low CRF [23]. However, most equations were developed using samples of Caucasian populations, potentially limiting their applicability across different ethnicities. The 2019 overview by Wang et al. found that no eCRF studies had been conducted on metabolic health outcomes [21]. Since then, there has been a gradual rise in cohort studies utilizing eCRF to assess the incidence of metabolic health risks.
Given the recent increase in studies since Wang et al.'s 2019 review, the aim of this review is two-fold. The first aim is to synthesize the existing longitudinal research on the association between eCRF and metabolic risk factors in adult populations. The second is to identify and discuss gaps in the current literature, highlighting areas for future research and practice.
4. Discussion
Studies on eCRF and metabolic health risks are scarce. To date, six cohort investigations have been published, providing evidence for the incidence of hypertension, hyperglycemia, and dyslipidemia. No studies have been conducted on eCRF and the incidence of obesity. This review provides emerging evidence for using eCRF as a prognostic indicator for metabolic health risk. Significant inverse and dose-response associations were repeatedly demonstrated between higher eCRF and lower risk of high blood pressure, high blood sugar, and abnormal lipids. These findings are aligned with previous studies using measured CRF. Most CRF cohort studies have been limited to primarily male Caucasian populations [12,15]. However, the increased use of eCRF in population health data sets has begun to expand the evidence across age groups, females, ethnicities, and socioeconomic status.
The limitations identified across some of the eCRF cohort studies in this review include concerns about low sample size, measurement accuracy, confounders, covariates, and generalizability of findings. In 2019, Wang et al. provided a scoping review of more than twenty eCRF equations [21]. At the time, five eCRF health outcome studies focused on mortality as the primary outcome. Since then, the literature has expanded to include metabolic health risk outcomes, as discussed in this review. The ACLS Jackson equation was the most commonly used equation, calculating eCRF in five investigations [9]. The Jackson equation uses self-reported physical activity as one of its parameters, which was initially validated using the ACLS physical activity index [47]. Only the investigation by Patel et al. used the ACLS-validated scale [26]. The FOS, TMJC, RCCS, and CHARLS studies used unvalidated, domestically designed questionnaires and adapted the parameter into the equation. This adaptation method likely resulted in misclassification of eCRF levels in some participants, thereby reducing the accuracy and reliability of findings. It is also important to point out that self-reported physical activity status is prone to bias, leading to misclassification.
Other commonly cited issues are the homogeneous populations studied, often with high socioeconomic status or specific ethnic backgrounds, limiting the external validity of the results. The Wang et al. review also recommended choosing equations validated in the same ethnicity and age group as the study population. While there are validated eCRF equations for people of Chinese ethnicity, the CHARLS, TMJC, and RCCS studies used the Caucasian-validated Jackson equation, with promising findings aligned with CRF meta-analyses. Notably, most of the participants in the meta-analyses are Caucasian males [22,33].
Another caution when applying eCRF equations is using redundant covariates or confounders in multivariate analyses. For example, when BMI is a parameter in an eCRF equation and is used again as a covariate during analysis, it could lead to multicollinearity. Multicollinearity occurs when two or more predictor variables in a regression model are highly correlated, meaning that one can be linearly predicted from the others with a substantial degree of accuracy [48]. This redundancy may inflate the variance of the coefficient estimates and the standard errors, widening confidence intervals and making statistical tests less reliable and the model's predictions less precise. Potential solutions include conducting variance inflation factor (VIF) analysis or transforming a continuous variable by categorizing the covariate or confounder (e.g., 1 = BMI < 30, 2 = BMI ≥ 30) [49]. More recently, the advancement of causal inference through causal machine learning may offer a solution for more accurately accounting for covariates and confounders. Unlike associative studies that incorporate confounding variables to enhance the accuracy of outcome predictions, causal machine-learning models seek to isolate and exclude the influence of these variables to assess the impact of the exposure variable directly [50]. Furthermore, machine learning methods may be more beneficial when using large real-world data such as electronic health records.
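To make the VIF check concrete, the following is a minimal sketch in plain Python, not a method taken from any of the reviewed studies. It relies on the standard identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictors' correlation matrix. All data values, the variable names, and the toy eCRF formula (with invented coefficients that are not those of the Jackson equation) are assumptions chosen only for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Return {name: VIF} for a dict mapping predictor names to value lists."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inv = invert(corr)
    return {name: inv[i][i] for i, name in enumerate(names)}


# Hypothetical toy data: eight invented participants.
age = [30, 45, 52, 38, 60, 27, 49, 55]
bmi = [22.0, 27.5, 31.0, 24.0, 29.0, 21.0, 33.5, 26.0]
activity = [3, 1, 0, 2, 1, 4, 0, 2]  # invented self-reported activity score

# Invented illustrative formula, NOT the Jackson equation: it only mimics the
# fact that eCRF is built from age, BMI, and physical activity.
ecrf = [18 - 0.10 * a - 0.17 * b + 0.8 * p
        for a, b, p in zip(age, bmi, activity)]

# Entering eCRF together with its own inputs (age, BMI) as covariates.
result = vif({"ecrf": ecrf, "age": age, "bmi": bmi})
for name, value in result.items():
    print(f"{name}: VIF = {value:.2f}")
```

A VIF of 1 indicates no linear overlap with the other predictors, and larger values indicate more redundancy. Cut-offs for an acceptable VIF vary between fields, so any threshold applied to output like this should be justified for the analysis at hand.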
With the current dearth of literature, there are ample opportunities for further study regarding eCRF, metabolic risks, NCDs, and a broad range of health outcomes. Potential areas for future research may include focusing on larger multiethnic cohorts and young adults and comparing other eCRF equations for their predictive capability. Also, more evidence across diverse ethnicities and women is needed. One drawback is that a limited number of population health data sets or electronic health records contain CRF or all the parameters needed to calculate eCRF (e.g., those of the Jackson equation) [36]. Different eCRF equations may need to be applied to access more extensive, heterogeneous cohorts over longer durations. As discussed by Wang et al., eCRF models that do not use self-reported physical activity as a parameter may be applied more broadly (e.g., to electronic health records) [21,36].
From a metabolic health outcomes perspective, there are various potential cohort studies to consider. eCRF prediction of prehypertension, prediabetes, and borderline dyslipidemia would be helpful to inform primordial prevention initiatives. Studies focused on the incidence of obesity, metabolic syndrome, and NCDs would add significant value to the growing eCRF prediction literature. Prevalence studies for understanding the fitness level of a particular community, region, or company can help map the magnitude and distribution of low fitness and assist with public health planning. Lastly, conducting experimental intervention studies using change in eCRF would provide more evidence for the tool's validity.
A growing and essential area of eCRF research and primary care is net reclassification improvement (NRI) for risk estimation. NRI is a statistical approach that assesses the extent to which incorporating a new biomarker such as eCRF improves the classification of individuals into more appropriate risk categories [5]. For example, physicians often use the Framingham Risk Score to make clinical decisions for patients. To improve the accuracy of the 10-year CHD risk score, Gander et al. applied the Jackson eCRF [51]. The study showed that adding eCRF improves the overall accuracy of the Framingham Risk Score in Caucasian men for heart disease risk. Similar findings were reported in a nationally representative sample of Koreans and a southern Chinese population for CVD mortality and morbidity, respectively [52,53]. There are numerous other risk prediction tools (e.g., DM, CKD, dementia) where adding eCRF may add predictive value.
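As an illustration of how a categorical NRI is computed, the sketch below uses the commonly cited definition: the net proportion of people with the event who move to a higher risk category, plus the net proportion of people without the event who move to a lower one. It is not the analysis code of Gander et al. or of any study cited here. The function name, the category coding (0 = low, 1 = intermediate, 2 = high), and the eight toy participants are hypothetical.

```python
def net_reclassification_improvement(old_category, new_category, event):
    """Categorical NRI for a new risk model compared with an old one.

    old_category, new_category: ordinal risk categories per person
        (higher number = higher predicted risk).
    event: 1 if the person experienced the outcome, otherwise 0.
    Returns (nri, nri_events, nri_nonevents).
    """
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for old, new, had_event in zip(old_category, new_category, event):
        if had_event:
            n_e += 1
            up_e += new > old      # correctly moved to higher risk
            down_e += new < old    # wrongly moved to lower risk
        else:
            n_n += 1
            up_n += new > old      # wrongly moved to higher risk
            down_n += new < old    # correctly moved to lower risk
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one non-event")
    nri_events = (up_e - down_e) / n_e
    nri_nonevents = (down_n - up_n) / n_n
    return nri_events + nri_nonevents, nri_events, nri_nonevents


# Hypothetical toy data: risk categories from a baseline score alone (old)
# and from the same score with eCRF added (new).
old = [0, 1, 1, 2, 0, 1, 2, 0]
new = [1, 1, 2, 2, 0, 0, 1, 0]
outcome = [1, 1, 1, 1, 0, 0, 0, 0]

nri, nri_events, nri_nonevents = net_reclassification_improvement(old, new, outcome)
print(f"NRI = {nri:.2f} (events {nri_events:.2f}, non-events {nri_nonevents:.2f})")
```

A positive NRI means the new model moves people, on balance, toward the category that matches their observed outcome. Reporting the event and non-event components separately shows where any improvement comes from.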
Future Directions
CRF has been stipulated as a vital sign by the American Heart Association, and eCRF has been proposed for regular use in primary care settings to identify patients with low fitness and provide brief counseling [6,54]. However, a recent meta-analysis and systematic review concluded that this individualistic approach might not, on its own, improve physical activity, a key determinant of fitness, in a way that is sustained beyond 6 to 12 months [55,56]. In agreement with this observation, the International Society for Physical Activity and Health states, "Searching for a single solution to increasing physical activity may have hampered progress in this field by encouraging focus on simple, often short-term, individual-level health outcomes, rather than complex, multiple, upstream, population-level actions and outcomes" [57]. Brief counseling may be more effective when meshed with the determinants of eCRF in an individual's environment. This method is in line with the conceptual framework of the determinants of CRF (Figure 1) and stems from social ecological theory [18]. Consequently, research is needed on using this framework with eCRF.
Given the growing sophistication of technology, eCRF has the potential to be utilized as a population health vital sign to help prevent metabolic health risks. Electronic health records can auto-populate eCRF for rapid access and review [37]. Integrating geographic information systems with EHR-derived eCRF data can enhance the early identification and mapping of metabolic risk factors [58]. This integration allows public health officials to visually pinpoint areas of low fitness, referred to as hot spots, and further leverage eCRF parameters to segment and target specific populations, such as unfit middle-aged male smokers. Machine learning and artificial intelligence can further augment this process, enabling sophisticated, actionable analyses to guide targeted interventions, potentially maximizing the impact of individual, community, and public health initiatives [59].
The use of eCRF in healthcare and public health settings aligns with the International Society for Physical Activity and Health's Eight Investments That Work for Physical Activity [57]. Both initiatives focus on accessibility to diverse physical activity, exercise, and sports opportunities, facilitating the implementation of strategies like active travel and urban design. Effective collaboration between healthcare systems and public health is crucial for navigating the complex social ecology of communities. Such partnerships are instrumental in planning and deploying a systems-based approach to reduce metabolic risks.