1. Introduction
The disjunction between the degree of brain pathology and its clinical manifestations gave rise to the terms “cognitive reserve” and “brain reserve” [1]. These constructs often mediate between the clinical manifestations and the extent of brain pathology [2]. Cognitive reserve refers to “the adaptability (i.e., efficiency, capacity, flexibility) of cognitive processes that help to explain differential susceptibility of cognitive abilities or day-to-day function to brain aging, pathology, or insult” [3]. It includes determinants such as educational and occupational attainment [4,5] and is suspected to comprise a functional network actively involved in diverse cognitive processes [6]. Cognitive reserve has been investigated in many clinical populations with neurocognitive disorders, including Alzheimer’s disease [7], Multiple Sclerosis [8], Schizophrenia [9], and HIV [10]. The construct is vital for diagnostic purposes [2] and is a promising avenue for preventing and intervening in cognitive decline [11,12,13]. Some also promote it as a population-level intervention [14]. Currently, no reliable and valid tool for estimating cognitive reserve exists in the Arabic language, unlike other languages such as Italian [15], Spanish [16], Greek [17], and English [18].
There are five approaches to estimating cognitive reserve: (I) measurement of individual characteristics, (II) consideration of cumulative life experiences, (III) estimation of intellectual functioning, (IV) statistical modeling and calculations, and (V) derivation of brain network patterns via imaging [2]. Each approach has advantages and disadvantages, which will most likely change with time. For instance, statistical methods (decomposing the variance of a specific cognitive skill) provide an operational measure of reserve that is quantitative, continuous, and specific to the individual. However, it is not yet feasible for clinicians to apply such scores at the individual level [19]. Approach II, on the other hand, synthesizes numerous experiences relevant to the cognitive reserve construct and is currently feasible [2,20]. The instruments that adopt this approach are the Cognitive Reserve Index Questionnaire (CRIq) [15], Cognitive Reserve Questionnaire (CRQ) [21], Cognitive Reserve Scale (CRS) [16], Lifetime of Experience Questionnaire (LEQ) [18], Retrospective Indigenous Childhood Enrichment scale (RICE) [22], Premorbid Cognitive Abilities Scale (PCAS) [23], and Cognitive Reserve Assessment Scale in Health (CRASH) [24]. For a comprehensive review of these scales (except for CRASH), the reader is referred to Kartschmit et al. (2019). Although the authors did not recommend one specific tool, they indicated that the CRIq is the most extensively evaluated [20]. Compared to other tools, the CRIq strikes a balance among administration time, cognitive reserve dimensions covered, interview span, and psychometric properties [15,20].
We previously examined cognitive reserve in Lebanon using approach I (i.e., measurement of individual characteristics) [25]. In a sample of 508 community-based older adults, we showed that high education, complex occupation attainment, and leisure activity significantly predicted better global cognitive function [25]. The current study aims to analyze the psychometric properties of the Arabic version of the CRIq. Since cognitive reserve focuses on the idea that there are individual differences in the adaptability of functional brain processes that allow some people to cope better than others with age- and disease-related brain change [1,2,3], we adopt an individual differences approach in validating the tool [26,27,28]. We hypothesized the following: (I) the activities reported on the CRIq-Arabic would differ slightly from those in the original study, (II) the CRIq-Arabic would show good internal consistency, (III) the cognitive reserve domains (education, working activity, and leisure time) as measured by the CRIq-Arabic would converge with cognitive reserve as a latent construct, and (IV) the cognitive reserve construct would stand apart from other functional/cognitive processes and diverge from measures of cognitive functioning (such as fluid intelligence).
2. Materials and Methods
This cross-sectional observational study is part of a larger project to validate an Arabic version of the Brief International Cognitive Assessment for Multiple Sclerosis (BICAMS) [29]. All participants provided written informed consent.
2.1. Sample
Individuals were recruited from the community. Two screening phases were run to ensure that the sample was healthy. In the first phase, participants were excluded if they were younger than 16 years or had a history of neurological disorders, traumatic brain injury, or psychiatric disorders (including alcohol and/or drug dependence). Individuals were also excluded if they were taking medications affecting cognitive function, such as antipsychotics or antidepressants. During the second screening phase, participants were excluded if they had symptoms of depression or cognitive difficulties. The former were assessed using the Hopkins Symptoms Checklist-25 (HSCL-25; 3.3 cut-off score) [30,31,32], and the latter using the Montreal Cognitive Assessment (MoCA; cut-off scores: 26 for individuals < 60 years of age, and 24 for those ≥ 60 years of age) [33,34,35].
2.2. Procedure and Data Collection
All tests were administered in a standardized manner, in a quiet room, using the Lebanese Arabic dialect. After the screening phases, the CRIq was administered, followed by the Test of Nonverbal Intelligence, 3rd Edition (TONI-III) and then the BICAMS.
We followed the WHO guidelines on translation and adaptation of instruments: the CRIq was translated into Arabic and then back-translated by an experienced English instructor who teaches at the university level. The principal investigator reviewed the tool and adapted two items to Lebanese cultural factors. First, working as a nurse was considered professional employment. Second, the fourth question of the CRIq-Arabic leisure time domain included additional hobbies such as backgammon and playing cards (common in Lebanese culture).
The CRIq is a semi-structured interview that includes 20 items and demographic information. It takes approximately 15 minutes to administer. The instrument examines the frequency and duration of three sets of activities: education (years of formal and informal education), working activity (years and level of professional occupation), and leisure time (years of frequent engagement in various activities such as reading books). As such, the CRIq yields an index score and a score for each of its three domains. These scores are adjusted for age [15].
To examine fluid intelligence, we used the Test of Nonverbal Intelligence, 3rd Edition [
36]. The TONI-III is a language-free intelligence test that includes 45 items to be solved (for each form). Ceiling/discontinuation rules apply to the test, and raw scores are normed to an American sample of 3451 individuals [
36].
The Arabic BICAMS includes three tests: the Symbol Digit Modalities Test (SDMT), the Brief Visuospatial Memory Test–Revised Edition (BVMT-R), and the Verbal Memory Arabic Test [37].
The SDMT examines processing speed. The oral version of the test was administered. Using a test form containing 9 symbol-digit pairings (key) and a pseudo-randomized sequence of symbols (stimuli), the examinee must respond by voicing the digit associated with each symbol as quickly as possible. A sequence of 10 symbols is first used for practice. Then, the participant is given 90 seconds to complete as many of the remaining items on the form as possible. The score is the number of correct responses [38].
For the BVMT-R, participants are exposed for 10 seconds to a matrix of six simple abstract designs, followed by unaided recall. The examinee is asked to reproduce the designs using paper and pencil as accurately as possible and to place the figures in their correct positions. Three such trials were administered to each participant. Each figure receives a score of 0, 1, or 2 based on accuracy and location criteria [39]. The score of interest was the total of trials 1 to 3.
We recently developed and validated the Verbal Memory Arabic Test (VMAT), which replaced the California Verbal Learning Test-2nd Edition (CVLT-II) in our study (the CVLT-II is part of the original BICAMS protocol). The VMAT was developed indigenously in Arabic using quantitative and qualitative methods. The instrument measures verbal learning, short-term memory, long-term memory, and recognition. Similar to other standardized verbal learning/memory tests, and in line with the recommendations of Benedict et al. (2012), the examinee is presented with 15 words (List A) to be recalled freely across 5 trials and is then presented with another 15 words (List B), which serve as an interference trial. Following the recall of List B, the participant is required to recall List A with and without semantic cues. Following a 25-minute delay, the test-taker must recall List A with and without cues and then recognize the words from List A in an array of 45 words that includes List A, List B, and additional distractors. Several scores can be derived from the VMAT [40]. The VMAT variables used in this study were the total number of words recalled on trials 1 to 5, short-delay free recall, and long-delay free recall.
2.3. Statistical Analysis
Descriptives of demographic data, scores on cognitive measures, and CRIq-Arabic questions were derived. Specifically, for the CRIq-Arabic education domain, information on the average number of years for the two education items is presented. For the CRIq-Arabic working activity domain, percentages of types of working activity according to cognitive resources involved are listed. For the CRIq-Arabic leisure time domain, percentages of types of activities carried out during leisure time are indicated. Descriptive data for the 3 CRIq-Arabic domains is also derived according to each age group (young, middle-aged, and older adults).
CRIq-Arabic standardized scores were derived using information from regression models. Each CRIq-Arabic domain was subjected to this analysis with age as an independent variable. Assumptions such as homoscedasticity and independence of observations were checked as appropriate. None of the models violated the assumptions.
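The regression-based standardization described above can be illustrated with a minimal sketch. It assumes that each age-corrected score is the residual from a simple linear regression of the raw domain score on age, rescaled to a chosen mean and SD. The target values (100 and 15), the use of the population SD, and the example data are assumptions made for illustration; the exact scaling used in the study may differ.

```python
from statistics import mean, pstdev


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def age_corrected_scores(ages, raw, target_mean=100.0, target_sd=15.0):
    """Rescale regression residuals (raw score given age) to a standard metric.

    target_mean and target_sd are assumed values, not taken from the study.
    """
    intercept, slope = fit_line(ages, raw)
    residuals = [r - (intercept + slope * a) for a, r in zip(ages, raw)]
    sd = pstdev(residuals)
    return [target_mean + target_sd * res / sd for res in residuals]


# Hypothetical ages and raw leisure-time scores (not study data).
ages = [20, 35, 50, 65, 80]
raw_leisure = [60, 150, 290, 380, 520]
scores = age_corrected_scores(ages, raw_leisure)
print([round(s, 1) for s in scores])
```

A participant whose raw score is above the value predicted for their age receives a standardized score above the target mean, and vice versa.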
For reliability, internal consistency was examined using Cronbach’s α and split-half correlations between odd and even questions of the CRIq-Arabic.
Structural equation modeling (SEM) was used to examine convergent and discriminant validity. We built four models based on the approach of Salthouse et al. (2003), previous literature [27,28], and data availability. The larger BICAMS project did not include explicit measures of executive functions such as mental flexibility. The cognitive domains available were defined according to the Cattell-Horn-Carroll model of cognition [41,42]. Fluid reasoning (Gf) was reflected through the TONI-III. Long-term memory encoding and retrieval (Glr) was reflected through the VMAT scores for total trials 1 to 5, short-delay free recall, and long-delay free recall, as well as the BVMT-R total score for trials 1 to 3. Processing speed (Gs) was reflected through the SDMT score.
Abbreviations used in the models: Gf, fluid intelligence; Glr, long-term memory encoding and retrieval; Gs, processing speed; CRIq-A, Cognitive Reserve Index questionnaire-Arabic; EDU, education; WA, working activity; LT, leisure time; TONI-III, Test of Nonverbal Intelligence, 3rd Edition; T1 to T5, Verbal Memory Arabic Test total trials 1 to 5; SDF, Verbal Memory Arabic Test short delay free; LDF, Verbal Memory Arabic Test long delay free; BVMT-R, Brief Visuospatial Memory Test–Revised Edition; SDMT, Symbol Digit Modalities Test.
Model A includes only the CRIq-Arabic target variables (education, working activity, and leisure time) and the hypothesized construct (CRIq-Arabic).
Model B allows the hypothesized construct to be related to other constructs (Gf, Glr, and Gs). Model C allows the target variables to be related to other constructs if doing so improves fit compared to Model B.
Lastly, Model D examines the variance common to the hypothesized construct when relations of individual variables to other constructs are considered. Convergent validity is supported by moderate to strong loadings of the target variables on the hypothesized construct, whereas discriminant validity is supported by weak relations between the hypothesized construct and the other constructs [26].
Full-information maximum likelihood estimation was used to deal with missing data. To achieve identifiability, parameters for CRIq-Arabic leisure time were fixed at one. Reliability was set at 0.8 for TONI-III and SDMT to correct for single indicator constructs. The values selected were based on literature that shows good psychometric properties of these tests [
38,
43].
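One common way to correct for a single-indicator construct is to fix the indicator's error variance at (1 − reliability) multiplied by its observed variance. The sketch below illustrates that arithmetic only; it is an assumption about how the correction could be specified, not a description of the exact AMOS setup used here, and the variance value is hypothetical.

```python
def single_indicator_error_variance(observed_variance, reliability):
    """Error variance to fix for a construct measured by one indicator."""
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return (1.0 - reliability) * observed_variance


# Hypothetical observed variance of a test score (not study data).
observed_variance = 36.0
fixed_error = single_indicator_error_variance(observed_variance, 0.8)
print(round(fixed_error, 2))
```

With an assumed reliability of 0.8, one fifth of the observed variance is treated as measurement error.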
The fit of the models was examined using several indices: chi-square (χ²), the critical ratio (χ²/df), the root mean square error of approximation (RMSEA), and Bentler’s comparative fit index (CFI). For CFI, values closer to one indicated a good model fit. For the other indices, values closer to zero indicated a good model fit [44].
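For readers less familiar with these indices, the following sketch shows how the critical ratio, RMSEA, and CFI are conventionally computed from the model and baseline chi-square values. The formulas are the standard textbook definitions (some software uses N rather than N − 1 in the RMSEA denominator). The chi-square and df values are hypothetical, and only the sample size of 174 comes from this study.

```python
from math import sqrt


def critical_ratio(chi2, df):
    """Chi-square divided by its degrees of freedom."""
    return chi2 / df


def rmsea(chi2, df, n):
    """Root mean square error of approximation (N - 1 convention)."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Bentler's comparative fit index relative to a baseline model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


# Hypothetical fit results; only n = 174 is taken from the study.
chi2_m, df_m = 24.0, 12
chi2_b, df_b = 412.0, 21
n = 174

print(round(critical_ratio(chi2_m, df_m), 2))
print(round(rmsea(chi2_m, df_m, n), 3))
print(round(cfi(chi2_m, df_m, chi2_b, df_b), 3))
```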
Analyses were performed using SPSS version 25 and AMOS version 23. Tests were two-tailed, and results with p < 0.05 were considered significant.
3. Results
3.1. Participants
A total of 226 individuals between the ages of 18 and 80 years were approached. Of these, 23 were excluded during the first screening and 29 during the second screening. Of the latter, 25 individuals were screened out due to cognitive difficulties as assessed by the MoCA. The final sample included 174 participants with a mean age of 44.40 ± 18.37 years. Sample characteristics are presented in Table 1, and scores on cognitive measures are in Table 2.
3.2. CRIq-Arabic Descriptions and Computations
Descriptive data about the CRIq-Arabic activities performed are presented in Table 3.
The mean raw scores for the CR domains education, working activity, and leisure time were 17.75 ± 4.99, 71.76 ± 70.15, and 232.67 ± 164.95, respectively.
Figure 2 includes a scatter plot for each domain of raw scores by age.
All regression models, in which age was the predictor and each CR domain was the dependent variable, were significant (p < 0.001). For the education domain, the y-intercept was 21.86 and the slope was −0.09. This yielded age-corrected scores that ranged from 54.79 to 159.71, with a mean of 100 and an SD of 14.10.
For the working activity domain, the y-intercept was −29.67 and the slope was 2.28; corrected scores ranged from 67.78 to 145.89 (100 ± 12.02).
For the leisure time domain, the y-intercept was −119.58 and the slope was 7.93; corrected scores ranged from 73.59 to 121.50 (100 ± 7.02).
The final CR index ranged from 47.39 to 149.52 (100 ± 15). An Excel file for automatic computations is available from the authors upon request.
3.3. Internal Consistency
Measures of internal consistency showed good evidence of reliability. Cronbach’s α of the full scale was 0.88. The correlation between the odd and even forms was 0.80 for the split-half procedure, and the Spearman-Brown coefficient was 0.89.
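As an illustration of these reliability statistics, the sketch below computes Cronbach's α from item scores and applies the Spearman-Brown formula to a split-half correlation. The functions are standard definitions written for this example; the item data are hypothetical. Applying the Spearman-Brown formula to the reported split-half correlation of 0.80 reproduces the reported coefficient of 0.89.

```python
from statistics import pvariance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of scores per item,
    aligned so that position i in every list is respondent i."""
    k = len(items)
    totals = [sum(responses) for responses in zip(*items)]
    item_variance_sum = sum(pvariance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / pvariance(totals))


def spearman_brown(r_half):
    """Step up a split-half correlation to full-test reliability."""
    return 2 * r_half / (1 + r_half)


# Split-half correlation reported in the text.
print(round(spearman_brown(0.80), 2))  # 0.89

# Hypothetical item scores for four respondents on three items.
example_items = [[1, 2, 3, 4], [2, 2, 4, 5], [1, 3, 3, 5]]
print(round(cronbach_alpha(example_items), 2))
```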
3.4. Construct Validity
For Model A, a one-factor model was run with education, working activity, and leisure time as target variables and the cognitive reserve latent construct, measured by the CRIq-Arabic. The model's overall fit could not be determined since there were no degrees of freedom. Nonetheless, the three CRIq-Arabic variables had significant moderate to strong loadings on the latent construct. Therefore, initial evidence for convergent validity is present.
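The absence of degrees of freedom in Model A follows from a simple count. A one-factor model with three indicators has as many free parameters as there are unique variances and covariances, so it is just-identified and its fit cannot be tested. The sketch below shows the count under a conventional parameterization (one loading fixed at one); the function name is hypothetical, and mean-structure parameters are ignored because they add equally to both sides.

```python
def cfa_degrees_of_freedom(n_indicators, n_free_parameters):
    """Unique variances/covariances minus freely estimated parameters."""
    unique_moments = n_indicators * (n_indicators + 1) // 2
    return unique_moments - n_free_parameters


# Three indicators: education, working activity, leisure time.
free_loadings = 2      # third loading fixed at one for identification
error_variances = 3
factor_variance = 1
free = free_loadings + error_variances + factor_variance

print(cfa_degrees_of_freedom(3, free))  # 0
```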
Model B examines the latent construct in the context of three non-target constructs: Gf, Glr, and Gs. Correlations between the target and non-target constructs were substantially less than 1 (e.g., Gs = 0.37), thus showing initial evidence of discriminant validity. The model also continued to show that the variables hypothesized to represent cognitive reserve have convergent validity, with moderate to high loadings. The model fit characteristics were adequate.
In Model C, two non-target constructs (i.e., Gf and Glr) were allowed to relate to two cognitive reserve variables. The model's overall fit did not change compared to Model B. Evidence of convergent validity continued to be present, with moderate to high loadings (e.g., working activity = 0.64). For discriminant validity, correlations increased slightly but remained substantially less than 1 (e.g., Gs = 0.48).
In Model D, each non-target construct is allowed to relate to each of the observed cognitive reserve variables simultaneously. Loadings were moderate overall but low for education (0.35). Among the non-target constructs, only Gf was significantly related to education (estimate = 0.27).
4. Discussion
This study examined the psychometric properties of an Arabic version of the CRIq. The world's fifth most spoken language is Arabic, and 23 countries have Arabic as their official language [
45]. Results of our study suggest that the performed activities are comparable to the original CRIq study and that the tool exhibits good construct validity and internal consistency. Integrating a valid cognitive reserve measure into the diagnostic formulation is critical [
2]. Clinical considerations include, but are not limited to, differences between individuals with high versus low reserve at the point when clinical symptoms appear, reserve as a factor affecting the rate of decline, and the possibility that reserve influences response to treatment [
2].
Education and the percentages of types of activities carried out within the working activity and leisure time domains were largely comparable to those reported by Nucci et al. (2012). There were slight differences, however, in the distribution of a few types of activities, which is partly in concordance with our first hypothesis. This highlights the importance of relying on culturally appropriate data [
2,
46]. Specifically, level 1 working activity (i.e., low-skilled manual work) was more frequent in Nucci et al. (2012), whereas in our sample level 4 (i.e., professional occupation) was the most frequent. Furthermore, the two leisure time activities that differed were ‘using new technologies’ and ‘managing one’s bank account.’ While both activities decreased with age, in our sample they remained more frequent than those reported by Nucci et al. (2012). In addition to some apparent lifestyle differences between the two samples, it should be noted that individuals in our sample were screened out in case of cognitive impairment, as assessed by the MoCA. Nucci et al. (2012) indicated that their sample had no evident neurologic or psychiatric illness. However, it was unclear whether individuals were screened based on cognitive functioning. Consequently, given the difference in the distribution of working activity between the two samples, the sample profile could explain part of the differences.
The measure of internal consistency (Cronbach’s α = 0.88) was good in our study and higher than that reported by Nucci et al. (2012; Cronbach’s α = 0.62). This supports one form of reliability of the CRIq-Arabic and confirms our second hypothesis. Further studies are needed to establish other types of reliability of the Arabic version (e.g., test-retest).
Theoretically, examining cognitive reserve has several measurement challenges [
14,
47]. Acknowledging these challenges, we validated the CRIq-Arabic using the approach followed by Salthouse et al. (2003) and applied by others in cognitive reserve research [
27,
28]. Our models consistently showed that the cognitive reserve domains (education, working activity, and leisure time), as measured by the CRIq-Arabic, converge with the latent construct of cognitive reserve, supporting the third hypothesis. This result is not surprising given the ample evidence on the role of these variables in cognitive reserve [
4] and their utility in most cognitive reserve scales [
20]. For example, the LEQ examines specific (education, occupation) and non-specific (leisure time) mental activity at each of three life stages [
18]. In addition to Nucci et al. (2012), this result supports one aspect of construct validity of the CRIq in general. To the best of our knowledge, only Nucci et al. (2012) previously examined the convergent validity of the scale [
20]. The result is likewise in concordance with our previous study in Lebanon, in which education, complex occupation attainment, and leisure activity significantly predicted better global cognitive function in a sample of older adults [
25].
Lastly, there was good support for the fourth hypothesis, which indicated that the cognitive reserve construct would diverge from measures of cognitive functioning (such as fluid intelligence). Ideally, as a first step to provide evidence of discriminant validity, the magnitude of the loadings on the cognitive reserve construct from the observed variables (in this study: education, working activity, leisure activity) should be the same or larger than the correlations among the target and non-target constructs (in this study: Gf, Glr, and Gs) [
26]. This was the case in Models B and C. Results for Model D, which is the most stringent of the four models [26], suggest that education is related to Gf. It is difficult to compare this result with other studies, since they did not examine Gf [
27,
28,
48], although Satz et al. (2011) implicate fluid cognitive ability in cognitive reserve. More comprehensive studies that include measures of fluid (in addition to crystallized) intelligence can better inform cognitive reserve measurement. It should also be noted that while education is a valid proxy indicator of cognitive reserve, some suggest that literacy is a more sensitive indicator [
49,
50].
Our findings should be interpreted considering several limitations. The main limitation is that executive functions were not measured. Although the contribution of executive functions to unique variance in cognition is debated [
41,
51], this construct remains elusive in cognitive reserve [
26,
47,
48]. The second limitation is that the scale was not validated in a clinical sample. Indeed, Cosentino and Stern (2019) indicate that the concept of cognitive reserve only applies when considering variability in cognitive functioning (e.g., memory) in the face of changes in brain integrity (e.g., hippocampal volume). Future studies on CRIq-Arabic validation should be performed in clinical populations such as those with Multiple Sclerosis. Final limitations include the lack of Arabic normative data for most of the scales used and the use of a single measure each for Gf and Gs. Strengths of the study include the large sample size, which is comparable to that of other research on the CRIq [20] and to that of Ikanga et al. (2017), who tested formative and reflective models of cognitive reserve. However, our sample size is not comparable to those of the original study by Nucci et al. (2012) and of Siedlecki et al. (2009). Another strength is the use of theoretically driven models and rigorous construct validity testing.
This study validated an Arabic version of one of the most widely used cognitive reserve scales in a healthy Lebanese sample. Further validation studies in clinical samples are warranted for additional scale evaluation. The CRIq-Arabic can be valuable in enhancing neuropsychological clinical practice.