1. Introduction
Both colorectal neoplasia (CRN) (including adenomas and colorectal cancer [CRC]) and cardiovascular disease (CVD) are prevalent pathologies, resulting in high mortality rates [1]. Several risk factors related to the development of CRN are also risk factors for CVD. The most well-established to date include age [2], male sex [3], smoking [4], type 2 diabetes mellitus and metabolic syndrome [5].
Because of the high prevalence of both CRC and CVD and the existence of similar risk factors, several studies have suggested that patients with established CVD have a higher risk of developing CRN [6,7]. Based on these studies, it would be reasonable to consider CRC screening in patients with established coronary artery disease, but this is not performed in routine clinical practice, probably due to the increased 10-year mortality [8] and the potential increased risk of procedure-related complications. However, CRC screening might be justified in patients who have a higher risk of developing CVD, but have not yet developed it.
Within the current framework of personalized medicine, the establishment of predictive risk models for CRN should allow individualized screening strategies. In the last decade, several studies have proposed the use of cardiovascular risk (CVR) scores as tools to predict the risk of CRN [9,10,11]; however, many of them were conducted in Asian populations.
Currently, in most countries, CRC screening is carried out in selected patients based on the patient's age and family or personal history of CRC [12]. Several other predictive scores for advanced CRN based on genetic variants, environmental/lifestyle factors or cardiovascular risk factors (CVRF) have also been published [13,14], but their discriminatory capacity was moderate. Therefore, there is a need to improve current CRC screening options.
To address this issue, the main objective of our study was to create and validate a predictive score for advanced CRN (including advanced adenoma and CRC) based on CVRF and other factors previously described as risk factors for advanced CRN.
2. Materials and Methods
2.1. Study Design and Population:
This study comprised two cross-sectional cohorts: a derivation cohort (used to create the new score), which consisted of 1049 Caucasian patients who underwent a colonoscopy at the tertiary-level Lozano Blesa Clinical Hospital in Zaragoza between May 2010 and December 2014, and an external validation cohort of 308 Caucasian patients. Patients from the validation cohort underwent a complete colonoscopy between July 2019 and March 2020 at Son Espases University Hospital (a tertiary-level hospital in Palma de Mallorca, Spain). Patients were included at the time of colonoscopy. Inclusion criteria were: patients aged ≥18 years who underwent a colonoscopy due to (1) CRC screening in average-risk individuals older than 50 years, (2) digestive symptoms or (3) non-hereditary familial CRC history. Exclusion criteria were: patient refusal to participate in the study, personal history of CRC or prior polypectomy, family or personal history of hereditary CRC (polyposis or non-polyposis), personal history of inflammatory bowel disease, incomplete colonoscopy (cecal intubation not achieved), poor preparation (Boston scale <6 points) and lack of essential demographic or clinical information for the study.
2.2. Ethical Considerations:
All subjects provided informed consent for the study, which was approved by the ethics committee of each participating hospital. The study was conducted in accordance with the ethical standards established in the 1964 Declaration of Helsinki and its subsequent amendments.
2.3. Study Variables:
Demographic, clinical and biochemical data were recorded as shown in Table 1 (data from the six months prior to the colonoscopy). Laboratory parameters and weight measurements were obtained after an overnight fast of at least 8 hours. Blood pressure measurements were obtained on the day of colonoscopy or, if available, from primary care records in the 6 months prior to the procedure. Medication use was recorded through personalized interviews or electronic prescription verification. CRC family history was obtained through personalized interviews. Histological examination was conducted by expert pathologists specifically dedicated to the study of digestive pathology. CRN was classified according to the criteria of the World Health Organization [15].
2.4. Definitions:
Alcohol and tobacco consumption: Current drinkers were defined as adults who consumed >2 drinks/day (men) or >1 drink/day (women). According to the definition of the National Center for Health Statistics, a current smoker was any adult who had smoked ≥100 cigarettes in their lifetime and currently smoked. A former smoker was defined as someone who had smoked ≥100 cigarettes but had quit smoking at the time of the survey.
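The consumption definitions above can be expressed as simple rules. The following Python sketch is illustrative only and is not the code used in the study (the analyses were run in R); the function names, the `"male"`/`"female"` coding and the `"never"` label for people below the 100-cigarette threshold are assumptions introduced here.

```python
def smoking_status(lifetime_cigarettes: int, currently_smokes: bool) -> str:
    """Classify smoking status using the 100-cigarette lifetime threshold.

    The "never" label for people below the threshold is an assumption;
    the text only defines current and former smokers.
    """
    if lifetime_cigarettes < 100:
        return "never"
    return "current" if currently_smokes else "former"


def is_current_drinker(drinks_per_day: float, sex: str) -> bool:
    """Current drinker: >2 drinks/day for men, >1 drink/day for women."""
    thresholds = {"male": 2.0, "female": 1.0}  # hypothetical sex coding
    if sex not in thresholds:
        raise ValueError("sex must be 'male' or 'female'")
    return drinks_per_day > thresholds[sex]
```

Both thresholds are strict inequalities, so a man reporting exactly 2 drinks/day is not classified as a current drinker under this reading.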
The normality criteria for the risk factors were established according to the "European Guidelines on Cardiovascular Disease Prevention in Clinical Practice (2016 version)" [16] and the standards of the Spanish Society of Atherosclerosis 2022 [17].
Average risk population of CRC: Asymptomatic patients between 50 and 69 years of age with no personal or family history of adenomas or CRC.
Non-syndromic familial CRC: Patients with first-degree family history of CRC in whom hereditary syndromes (polyposis and non-polyposis) were excluded through clinical criteria [12].
CRN was defined as any histologically confirmed adenocarcinoma or adenoma. Advanced CRN was defined as invasive CRC or adenoma ≥ 10 mm in diameter, high-grade dysplasia, significant villous component (> 20%) or any combination thereof.
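As an illustration of how these outcome definitions translate into a classification rule, a minimal Python sketch follows. The `Lesion` structure and its field names are hypothetical, and treating every adenocarcinoma as invasive CRC is a simplifying assumption of this example.

```python
from dataclasses import dataclass


@dataclass
class Lesion:
    # Hypothetical record of one histologically examined lesion.
    histology: str                  # e.g. "adenoma", "adenocarcinoma", "hyperplastic"
    size_mm: float = 0.0            # largest diameter in millimetres
    high_grade_dysplasia: bool = False
    villous_fraction: float = 0.0   # villous component as a fraction (0.0 to 1.0)


def is_crn(lesion: Lesion) -> bool:
    """CRN: any histologically confirmed adenocarcinoma or adenoma."""
    return lesion.histology in {"adenoma", "adenocarcinoma"}


def is_advanced_crn(lesion: Lesion) -> bool:
    """Advanced CRN: CRC, or an adenoma with any advanced feature."""
    if lesion.histology == "adenocarcinoma":
        return True  # assumption: adenocarcinoma is counted as invasive CRC
    if lesion.histology != "adenoma":
        return False
    return (
        lesion.size_mm >= 10
        or lesion.high_grade_dysplasia
        or lesion.villous_fraction > 0.20
    )
```

A patient would then be counted as having advanced CRN if at least one lesion satisfies `is_advanced_crn`.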
2.5. Calculation of Cardiovascular Risk:
In the derivation cohort of 1049 patients, CVR was calculated using: Framingham-Wilson [18], REGICOR (Girona Heart Registry) [19], SCORE (Systematic Coronary Risk Estimation) for low-risk countries [20] and FRESCO (Spanish risk function of coronary and other cardiovascular events) [21]. These are presented as continuous variables.
2.6. Statistical Analysis:
An initial descriptive analysis of all clinical variables was performed. Qualitative variables were expressed as frequencies and percentages and continuous variables as median with interquartile range (Q1 - Q3). Normality was assessed using the Shapiro-Wilk test. Differences between independent groups were evaluated using the Chi-square (χ2) test for qualitative variables and the Mann-Whitney or Kruskal-Wallis test for continuous variables.
A multivariate logistic regression model with 10-fold cross-validation was used to create a new predictive score for advanced CRN, adjusted for all variables retrieved in the study (sex, age, first-degree family history of CRC, systolic blood pressure [SBP], diastolic blood pressure [DBP], total cholesterol, high-density lipoprotein [HDL] cholesterol, body mass index [BMI], diabetes, smoking, and antihypertensive treatment). The logistic regression model is constructed by applying the logistic function to a linear combination of predictor variables, using the formula:
$$P(Y = 1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n)}},$$
where Y is the binary dependent variable, X₁, X₂, …, Xₙ are the predictor variables, and β₀, β₁, …, βₙ are the coefficients to be estimated through the optimization process. The correlation coefficients are shown in Supplementary Table S1.
Once the coefficients β₀, β₁,…, βn have been estimated, this formula is used to calculate the probability of the binary dependent variable Y being equal to 1. The estimated β coefficients in the derivation population are the ones subsequently applied in the validation population.
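To make this step concrete, the following Python sketch shows how a predicted probability is obtained from an intercept, a set of coefficients and a patient's predictor values. It is a generic illustration of the logistic function, not the study's implementation; the function name and the numeric values in the usage lines are invented for demonstration and are not the coefficients estimated in the derivation cohort.

```python
import math


def predicted_probability(intercept: float,
                          coefficients: dict[str, float],
                          predictors: dict[str, float]) -> float:
    """Apply the logistic function to the linear combination of predictors."""
    linear = intercept + sum(
        beta * predictors[name] for name, beta in coefficients.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients and patient, for illustration only.
example_coefficients = {"age": 0.04, "male": 0.5}
example_patient = {"age": 60, "male": 1}
risk = predicted_probability(-4.0, example_coefficients, example_patient)
```

Applying the score to a validation cohort amounts to calling the same function with the coefficients held fixed at the values estimated in the derivation cohort.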
The discriminative ability of each score to predict advanced CRN was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC) and 95% confidence intervals (CI). The DeLong test was used to test the statistical significance of the difference between the areas under two dependent ROC curves.
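For readers less familiar with the AUC, it can be read as the probability that a randomly chosen patient with advanced CRN receives a higher score than a randomly chosen patient without it, with ties counted as one half. The Python sketch below computes it directly from that definition. It is an illustrative example with a hypothetical function name, and it does not implement the confidence intervals or the DeLong comparison used in the study.

```python
def auc(scores: list[float], labels: list[int]) -> float:
    """AUC as the fraction of (case, control) pairs ranked correctly.

    labels: 1 for patients with the outcome, 0 for those without.
    Ties between a case and a control count as half a correct pair.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("both outcome classes must be present")
    correct = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                correct += 1.0
            elif case_score == control_score:
                correct += 0.5
    return correct / (len(cases) * len(controls))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation; the pairwise loop is quadratic in cohort size, which is acceptable for an illustration but slower than rank-based implementations.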
The significance level in the study was set at 0.05. The analyses were conducted using the R programming language v.3.5.3 (R Foundation for Statistical Computing, Vienna, Austria) [22].
The results obtained were validated in a different external population.
4. Discussion
Both CRC and CVD are among the leading causes of mortality and morbidity worldwide. Previous studies have shown a strong coexistence of CRN and CVD, probably due to shared risk factors (e.g., smoking, obesity and metabolic syndrome) and pathophysiological mechanisms (e.g., insulin resistance, chronic inflammation and oxidative stress) [23].
CRC often develops from precancerous lesions, so it is also one of the most preventable and curable tumors when detected early. However, the implementation of colonoscopy-based screening can be limited for several reasons, including insufficient resources, low participant compliance or concerns about procedure-related complications. In this regard, risk prediction models could help identify high-risk individuals more accurately and implement timely prevention measures, thereby improving the effectiveness of, acceptance of, and compliance with screening.
Based on the presence of shared risk factors between CRC and CVD, CVR scores could be useful not only for predicting individual CVR but also for predicting the risk of advanced CRN. Furthermore, creating a score that combines CVRF with other risk factors linked to CRC could be highly valuable in establishing more personalized screening measures.
In this study, we present a simple and valid score for predicting the risk of advanced CRN in a Southern European population, using age and sex, along with various CVRF and the presence of first-degree family history of CRC.
It should be mentioned that our risk score is not suited to sigmoidoscopy-based screening, as the primary endpoint of our model was advanced CRN located anywhere in the colorectum. Advanced CRN, not just CRC, was chosen for analysis because it has been suggested as the most appropriate target for endoscopic screening [3,14].
In our multivariate analysis, only age, sex, tobacco consumption, and diabetes were associated with the risk of advanced CRN. Contrary to what might be expected from the currently available evidence, after adjusted analysis, family history did not behave as a risk factor for advanced CRN. It has been established that first-degree relatives of CRC patients are at higher risk of developing CRN than the population without a family history, although this has been questioned lately, at least for individuals with only one first-degree relative [24]. The risk also varies depending on the degree of kinship, age of the index case at diagnosis, number of affected family members, and sex [25].
On the other hand, in a systematic review and meta-analysis of predictive scores for advanced CRN in an average-risk population [26], age, sex, first-degree family history of CRC, BMI, and smoking were the factors most commonly included in the scores, as they have been shown to be associated with the development of advanced CRN. Based on this, we included them in our multivariate analysis, together with other variables widely used to calculate CVR scores, such as SBP, DBP, and cholesterol (total and HDL).
To date, there are limited data on the potential role of CVR scores in predicting CRN. Only Framingham and SCORE [9,10,11,27,28] have been evaluated. However, other CVR scores used and adapted to the Spanish population, such as FRESCO or REGICOR, have not been evaluated. In our cohort, FRESCO-Model B and SCORE obtained the best predictive capacity. Furthermore, few European studies have evaluated CVRF along with family history of CRC to predict the risk of advanced CRN [29]. Although evaluating CVR scores was not the main objective of our study, their results and application should be interpreted with caution because they were applied to a population not specifically recruited for this purpose, including young and elderly patients, patients with known coronary artery disease, chronic kidney disease or diabetes, and patients with other pathologies that can modify CVR.
The predictive capacity of the different CVR scores for advanced CRN is not high (the best model obtained an AUC of 0.57; 95% CI: 0.53-0.61). In previous studies [30,31], the predictive capacity for CRC based on family history (adjusted for age, or for age and sex, respectively) showed an AUC ranging between 0.53 and 0.55. However, in our study, by combining the presence of family history with CVRF, we were able to improve the predictive capacity (AUC 0.68; 95% CI: 0.64-0.73) and, importantly, this improvement was maintained in the external validation cohort (AUC 0.67; 95% CI: 0.57-0.76).
This AUC value was slightly better than those reported in other studies that used predictive models based on dietary habits and lifestyle, such as the Betés score [13] (AUC 0.65 for advanced CRN; 95% CI: 0.61–0.68) or the Kaminski score [29] (AUC 0.62 for advanced CRN; 95% CI: 0.60–0.64). Along the same lines, our score also shows better predictive capacity than previously developed scores based on more complex and costly genetic analyses [30,31,32]. For example, Jeon et al. [30], with a combined model of genetic, environmental, and lifestyle factors, showed a discriminatory capacity of 0.63 for men and 0.62 for women. Ibáñez et al. [31] reported a similar discriminatory capacity (0.63) using genetic and environmental risk factors and family history. Our research group reported an AUC of 0.66 with a genetic model combined with age and sex [32]. Finally, among the studies with the highest predictive capacity, Cai et al. [14] reported an AUC of 0.74 (95% CI: 0.70-0.78), using age, sex, smoking, diabetes mellitus, and the consumption of green vegetables, pickles, fried foods, and white meats. However, some of these factors are prone to recall bias.
It should also be pointed out that the CRNAS score proposed in this study has been designed and validated combining a wide range of populations, including average risk (asymptomatic) individuals, symptomatic patients, or those with a family history of CRC who do not meet the criteria for hereditary CRC. Few studies have evaluated the risk of advanced CRN in symptomatic populations [33].
The validation cohort differed in baseline characteristics from the derivation cohort, although there were no differences in terms of sex and the difference in median age is likely not clinically relevant.
This study has several strengths. The first is its external validation in a population with baseline characteristics different from those of the derivation cohort, which increases the reliability of the results. Another is that the score is easy to implement in a primary care setting, since all variables are easy to collect and neither genetic data nor dietary habits are needed. When risk scores are used in clinical or community settings, the number of predictors should be as small as possible, risk factors should be easy to obtain, and there should be a balance between the simplicity of the model and its prediction accuracy. Furthermore, the score shows moderate accuracy for predicting advanced CRN, an improvement over relying solely on CVRF or even over models combining genetic data and dietary habits, and some of its parameters can also be used to calculate individual CVR.
However, our study also has limitations, such as those typical of a cross-sectional design. Regarding the recording of medication use and of tobacco and alcohol consumption, data collection and recall biases may have affected the results. Finally, the observed lack of association between CRC family history and advanced CRN may be because we did not perform subanalyses by degree of kinship, age of the index case at diagnosis, number of affected family members, or sex.
In summary, this study shows that the combination of CVRF and CRC-specific risk factors improves the predictive capacity to identify patients at high risk of advanced CRN. This simple score is easy to calculate and could change the way CRC screening measures are applied, making them more individualized, avoiding colonoscopies in low-risk patients and prioritizing them in high-risk patients.
In addition, it allows individual cardiovascular risk to be calculated, enabling the establishment of preventive measures for this condition which, in turn, could serve as primary prevention for CRC by controlling common risk factors. Finally, our study raises the question of whether patients with CRN have an intrinsically raised CVR compared with those without lesions or with other non-neoplastic lesions, which could explain the findings of previous studies [34,35]. These findings should be confirmed in further studies and in non-Caucasian populations before being applied in routine clinical practice.