6.1. Preliminary Discussion of the NDR Data
We use data from the NDR to assess and visualize the association between weight variability, age group, and gender among patients with T1D who experienced one of the 14 CVCs outlined in Section 1. We classify their weight variability using four quartiles: the first quartile includes patients with a CV of weight between 0 and 0.018 (Q1: low), the second quartile includes those with a CV of weight between 0.018 and 0.028 (Q2: moderately low), the third quartile includes those with a CV of weight between 0.028 and 0.043 (Q3: middle-high), and the fourth quartile includes those with a CV of weight of at least 0.043 (Q4: high). The CV of weight variable is then cross-classified with the four age group quartiles (age1 = 21–27 years; age2 = 27–40 years; age3 = 40–52 years; age4 = 52+ years) and gender (males and females) to create a three-way contingency table. Consequently, the data are summarized in 14 three-way contingency tables, one for each CVC.
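The cross-classification described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cut points are those quoted in the text, the function names (`classify`, `cross_classify`) and the sample records are hypothetical, and the rule that a value lying exactly on a boundary goes to the upper class is our assumption, since the text does not say how ties are handled.

```python
from bisect import bisect_right

# Cut points quoted in the text. Sending boundary values to the upper class
# is an assumption of this sketch, not something the source specifies.
CV_CUTS = [0.018, 0.028, 0.043]
AGE_CUTS = [27, 40, 52]
CV_LABELS = ["Q1", "Q2", "Q3", "Q4"]
AGE_LABELS = ["age1", "age2", "age3", "age4"]
GENDERS = ["male", "female"]


def classify(value, cuts, labels):
    """Return the label of the class that contains `value`."""
    return labels[bisect_right(cuts, value)]


def cross_classify(patients):
    """Tally (cv, age, gender) records into a 4 x 4 x 2 table of counts.

    Rows index weight variability, columns index age group and tubes
    index gender, matching the layout used for the tables in the text.
    """
    table = [[[0 for _ in GENDERS] for _ in AGE_LABELS] for _ in CV_LABELS]
    for cv, age, gender in patients:
        i = bisect_right(CV_CUTS, cv)
        j = bisect_right(AGE_CUTS, age)
        k = GENDERS.index(gender)
        table[i][j][k] += 1
    return table


# Hypothetical records for illustration only (not NDR data).
patients = [(0.010, 25, "male"), (0.050, 60, "female"), (0.030, 45, "male")]
table = cross_classify(patients)
```

Repeating the tally once per complication would give the 14 three-way tables referred to above.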
Table 1 presents Pearson’s three-way chi-squared statistic for each of the 14 CVCs given in Section 1, together with its p-value based on 24 degrees of freedom. Interestingly, the statistic is not statistically significant for every CVC: at the 1% level of significance, it indicates an association for only two of them, macroalbuminuria and retinopathy. In particular, there is no statistically significant association between weight variability, age group, and gender for any of the six cardiovascular diseases (stroke, myo, HFH, CABG, PCI, and PAD). Interested readers can find the related three-way tables in the supplementary material to this paper.
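For readers who wish to see where the 24 degrees of freedom come from, the following sketch computes Pearson's chi-squared statistic for complete independence in an I x J x K table, together with its degrees of freedom IJK - I - J - K + 2 (which equals 24 when I = J = 4 and K = 2). The function name `pearson_three_way` and the example counts are hypothetical, and the sketch assumes that every one-way margin is positive.

```python
def pearson_three_way(table):
    """Pearson's chi-squared statistic for complete independence in an
    I x J x K table of counts, returned with its degrees of freedom."""
    I, J, K = len(table), len(table[0]), len(table[0][0])
    cells = [(i, j, k) for i in range(I) for j in range(J) for k in range(K)]
    n = sum(table[i][j][k] for i, j, k in cells)

    # One-way marginal proportions for rows, columns and tubes.
    row = [sum(table[i][j][k] for j in range(J) for k in range(K)) / n for i in range(I)]
    col = [sum(table[i][j][k] for i in range(I) for k in range(K)) / n for j in range(J)]
    tube = [sum(table[i][j][k] for i in range(I) for j in range(J)) / n for k in range(K)]

    stat = 0.0
    for i, j, k in cells:
        expected = n * row[i] * col[j] * tube[k]  # count expected under independence
        stat += (table[i][j][k] - expected) ** 2 / expected

    df = I * J * K - I - J - K + 2
    return stat, df


# Hypothetical 4 x 4 x 2 counts with no association at all.
flat = [[[10, 10] for _ in range(4)] for _ in range(4)]
stat, df = pearson_three_way(flat)
```

The statistic is then referred to a chi-squared distribution with `df` degrees of freedom to obtain a p-value of the kind reported in Table 1.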
In the following sections, we discuss the partition and decomposition of Pearson’s chi-squared statistic for the CVCs macroalbuminuria and retinopathy. We focus exclusively on these two three-way contingency tables because only for these CVCs is there a statistically significant association between weight variability, age group, and gender.
6.2. Patients with T1D and Macroalbuminuria
The main question we shall address here is:
To what extent is there an association between weight variability, age group, and gender for patients with T1D and macroalbuminuria?
Table 2 presents a cross-classification of weight variability (row variable), age group (column variable), and gender (tube variable) among patients with T1D experiencing macroalbuminuria.
To assess the nature of the association between weight variability, age group, and gender in detail, we first partition Pearson’s three-way chi-squared statistic in accordance with equation (3), so that three bivariate terms and a trivariate term are produced.
Table 3 summarizes the partition of this statistic for Table 2. This includes the partition of the phi-squared into four terms, their percentage contribution to phi-squared, the corresponding degrees of freedom (df), the p-value, and the relative size of each partition term (which facilitates a comparison of the chi-squared values derived from different asymptotic chi-squared distributions). In Table 3, Pearson’s three-way statistic and its p-value (df = 24) provide clear evidence of a statistically significant association between the three variables of Table 2.
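Equation (3) is not reproduced in this section, so the sketch below assumes that it is the usual additive partition of Pearson's three-way statistic into a row-column term (IJ), a row-tube term (IK), a column-tube term (JK), and a trivariate term (IJK). The function name `partition_three_way` and the example counts are hypothetical. The code returns the four terms together with the total computed directly, so that the additivity can be checked numerically.

```python
def partition_three_way(table):
    """Split Pearson's three-way statistic into IJ, IK, JK and IJK terms.

    Assumes the standard additive partition; returns (terms, total), where
    `total` is the three-way statistic computed directly from the cells.
    """
    I, J, K = len(table), len(table[0]), len(table[0][0])
    cells = [(i, j, k) for i in range(I) for j in range(J) for k in range(K)]
    n = sum(table[i][j][k] for i, j, k in cells)
    p = [[[table[i][j][k] / n for k in range(K)] for j in range(J)] for i in range(I)]

    # One-way and two-way marginal proportions.
    a = [sum(p[i][j][k] for j in range(J) for k in range(K)) for i in range(I)]
    b = [sum(p[i][j][k] for i in range(I) for k in range(K)) for j in range(J)]
    c = [sum(p[i][j][k] for i in range(I) for j in range(J)) for k in range(K)]
    p_ij = [[sum(p[i][j]) for j in range(J)] for i in range(I)]
    p_ik = [[sum(p[i][j][k] for j in range(J)) for k in range(K)] for i in range(I)]
    p_jk = [[sum(p[i][j][k] for i in range(I)) for k in range(K)] for j in range(J)]

    terms = {"IJ": 0.0, "IK": 0.0, "JK": 0.0, "IJK": 0.0}
    total = 0.0
    for i, j, k in cells:
        indep = a[i] * b[j] * c[k]  # proportion expected under independence
        d_ij = (p_ij[i][j] - a[i] * b[j]) * c[k]
        d_ik = (p_ik[i][k] - a[i] * c[k]) * b[j]
        d_jk = (p_jk[j][k] - b[j] * c[k]) * a[i]
        d_ijk = p[i][j][k] - indep - d_ij - d_ik - d_jk  # what the pairs leave over
        terms["IJ"] += n * d_ij ** 2 / indep
        terms["IK"] += n * d_ik ** 2 / indep
        terms["JK"] += n * d_jk ** 2 / indep
        terms["IJK"] += n * d_ijk ** 2 / indep
        total += n * (p[i][j][k] - indep) ** 2 / indep
    return terms, total


# Hypothetical 2 x 2 x 2 counts for illustration only (not NDR data).
terms, total = partition_three_way([[[12, 5], [7, 9]], [[4, 11], [8, 6]]])
percent = {name: 100.0 * value / total for name, value in terms.items()}
```

The dictionary `percent` corresponds to the percentage contributions of the kind reported in Table 3. For a 4 x 4 x 2 table the four terms have 9, 3, 3, and 9 degrees of freedom respectively, which sum to the 24 degrees of freedom of the total.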
Table 3 also shows that the dominant source of association among the three variables is the association between weight variability and age group; this term contributes over 75% of the total statistic and has a p-value that is less than 0.001. While the p-value of the overall statistic in Table 1 shows there is a statistically significant association between the variables, Table 3 reveals that not all three bivariate terms of the partition yield a statistically significant association between the variables. Specifically, only the term for weight variability and age group demonstrates statistical significance. Conversely, the associations between weight variability and gender, and between age group and gender, are not statistically significant, nor is the trivariate term. This means that while not all sources of association between the variables are statistically significant, at least one is. Also, even where a term is not statistically significant, we can still identify categories, and joint categories, that do provide a statistically significant contribution to the overall association between the three variables. We now turn our attention to addressing this issue for Table 2.
To identify which categories of Table 2 provide a statistically significant contribution to the total association, we calculate the p-value for each quartile of the weight variability variable. We also calculate the p-value of the interaction between the age group and gender variables. These p-values are determined using the approach described in Beh and Lombardo [4]. Their methodology constructs 100(1 − α)% confidence ellipses for points in a low-dimensional plot generated by applying a CA to the data. The methodology also provides accurate approximate p-values for each point when assessing its contribution to the association (or lack thereof) between categorical variables. These p-values are designed to reflect the statistical significance of the distance of a category’s point from the origin, thus determining the statistical significance of the category to the association structure between the categorical variables. They are calculated taking into account the chi-squared statistic, the principal coordinates, the inertia explained by each axis, and the marginal proportions.
Table 4 shows that all the categories of the
weight variability are statistically significant at the
level, except for Q2. However, only two of the age group categories for males and females show statistical significance at the 1% level: "age4" for the males and "age1" for the females.
To visualize the association reflected in the p-values of Table 4, we consider a column-tube biplot with interactive coding of age group (column variable) and gender (tube variable). However, since the term for weight variability and age group is statistically significant (see Table 3), we also portray the row-column biplot that depicts this statistically significant association.
Figure 1 shows these two biplots: on the left is the column-tube interactive biplot, and on the right is the row-column interactive biplot. The column-tube interactive biplot on the left of Figure 1 depicts the three-way association where the age group-gender interaction is depicted using principal coordinates and weight variability is depicted using standard coordinates. The row-column interactive biplot on the right of Figure 1 portrays the three-way association where the weight variability-age group interaction is depicted using principal coordinates and gender is depicted using standard coordinates. The origin of both plots represents independence between the variables. The farther the interactive categories are from the origin, the stronger the association. The interactive principal coordinates provide an accurate depiction of their distance from the plot’s origin and from each other. Note that the distance of points that have standard coordinates cannot be appropriately evaluated. Only the angle they form with an interactive pair of categories can be considered: when this angle is acute, there is a strong positive interaction with the interactive pair of categories; when the angle is 90 degrees, there is independence; and when the angle is obtuse, there is a strong negative interaction.
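The angle rule can be made concrete with a short sketch. It assumes that the angle is measured at the origin between the position vector of a point in standard coordinates and that of an interactive point in principal coordinates; the function names and the coordinates used are hypothetical and are not read from Figure 1.

```python
import math


def angle_degrees(u, v):
    """Angle at the origin, in degrees, between two non-zero plot vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard against rounding
    return math.degrees(math.acos(cosine))


def interaction_sign(u, v, tol=1e-9):
    """Read the angle as in the text: acute is positive, obtuse is negative."""
    dot = sum(x * y for x, y in zip(u, v))
    if dot > tol:
        return "positive"
    if dot < -tol:
        return "negative"
    return "none"


# Hypothetical coordinates: one category in standard coordinates and two
# interactive points in principal coordinates.
category = (1.0, 0.0)
same_side = (0.4, 0.3)
opposite_side = (-0.5, 0.2)
signs = (interaction_sign(category, same_side), interaction_sign(category, opposite_side))
```

Because only the sign of the inner product matters for this reading, the length of the vector in standard coordinates plays no role, which is consistent with the remark that its distance from the origin should not be interpreted.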
The quality of the interactive biplots in Figure 1 is very good since they visually describe about 84% (for the column-tube interactive biplot) and 83% (for the row-column interactive biplot) of the total association. Note that the interactive coordinates farthest from the origin contribute most prominently to the association. In the column-tube biplot of Figure 1, they are "age4male", which is positively related to Q1, and "age1female", which is positively related to Q4. We can also see from Figure 1 that "age1male" is negatively related to Q1, while "age4female" is negatively related to Q4. In the row-column biplot of Figure 1, we can see that "Q1age4" is positively related to males, while "Q4age1" is positively related to females. Furthermore, "Q1age1" is negatively related to males, and "Q4age4" is negatively related to females.
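The quality percentages quoted for the biplots can be read as the share of the total association carried by the two plotted axes. A minimal sketch follows, assuming that quality is defined as the sum of the two largest principal inertias divided by the total inertia; the function name `biplot_quality` and the inertia values are hypothetical and were chosen only so that the share comes out near the 84% mentioned above.

```python
def biplot_quality(inertias, dims=2):
    """Percentage of the total inertia displayed by the first `dims` axes."""
    ordered = sorted(inertias, reverse=True)
    return 100.0 * sum(ordered[:dims]) / sum(ordered)


# Hypothetical principal inertias for illustration only.
quality = biplot_quality([0.060, 0.024, 0.016])
```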
In summary, the biplots in Figure 1 show that males with T1D and macroalbuminuria are characterized by having "low" weight variability and being in the "old" age group (aged 52 years or older). Figure 1 also shows that females with T1D and macroalbuminuria are generally characterized by having "high" weight variability and being in the "young" age group (aged 21–27 years).
6.3. Patients with T1D and Retinopathy
The primary question we address here is:
What is the strength of the association among weight variability, age group, and gender in patients diagnosed with T1D and retinopathy?
Table 5 presents a cross-classification of weight variability (row variable), age group (column variable), and gender (tube variable) among patients experiencing retinopathy, which is the most prevalent microvascular complication experienced by T1D patients and is often present during adolescence [28].
Table 6 reports Pearson’s three-way chi-squared statistic for Table 5 (df = 24) and shows that there is a statistically significant association between the three variables. To assess the nature of the association between weight variability, age group, and gender in detail for patients with T1D and retinopathy, we examine the partition of Pearson’s three-way chi-squared statistic in accordance with equation (3). This partition is summarized in Table 6, which indicates that the three bivariate association terms are statistically significant, albeit at different significance levels. The term for weight variability and age group indicates a significant association since its p-value is less than 0.001.
The term representing the association between weight variability and gender shows that there is a gender difference among patients with T1D and retinopathy at the 5% level of significance, and the term for age group and gender also shows a significant association at this level of significance. However, the trivariate term is not statistically significant. This implies that while not every association among the three variables is statistically significant, at least one is.
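Because the four partition terms are referred to chi-squared distributions with different degrees of freedom, the same value can be significant for one term and not for another. The sketch below, which uses only the Python standard library, evaluates the upper-tail chi-squared probability for integer degrees of freedom and lists the degrees of freedom of each term. The function names are hypothetical, and the degrees of freedom assume the standard partition into IJ, IK, JK, and IJK terms.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with
    integer df >= 1, built up with the recurrence
    Q(nu + 2) = Q(nu) + (x/2)**(nu/2) * exp(-x/2) / gamma(nu/2 + 1)."""
    half = x / 2.0
    if df % 2 == 0:
        prob, nu = math.exp(-half), 2              # closed form for df = 2
    else:
        prob, nu = math.erfc(math.sqrt(half)), 1   # closed form for df = 1
    while nu < df:
        prob += half ** (nu / 2.0) * math.exp(-half) / math.gamma(nu / 2.0 + 1.0)
        nu += 2
    return prob


def term_df(I, J, K):
    """Degrees of freedom of each term of the assumed partition."""
    return {
        "IJ": (I - 1) * (J - 1),
        "IK": (I - 1) * (K - 1),
        "JK": (J - 1) * (K - 1),
        "IJK": (I - 1) * (J - 1) * (K - 1),
    }


dfs = term_df(4, 4, 2)  # weight variability x age group x gender

# The same hypothetical value of 10.0 judged against each term's distribution.
p_values = {name: chi2_sf(10.0, df) for name, df in dfs.items()}
```

With these degrees of freedom, a value of 10.0 would fall below the 5% level for the two terms involving gender (3 degrees of freedom each) but not for the other two terms (9 degrees of freedom each), which illustrates why each term in Table 6 is assessed against its own reference distribution.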
To discern which categories significantly contribute to the association, we compute the approximate p-value [4] for each category within the weight variability and age group variables of the males and females; see Table 7. Table 7 shows that all the categories of weight variability have a p-value that is less than 0.001, showing that their contribution to the association that exists within Table 5 is statistically significant. Additionally, three age group categories for males show statistical significance: "age1Male", "age3Male", and "age4Male". However, for the females in the study, only "age1Female" is statistically significant.
Figure 2 illustrates the three-way association. Since the association between age group and gender is statistically significant (see Term - JK in Table 6), this figure shows only the column-tube interactive biplot.
Figure 2 shows that the quality of the column-tube interactive biplot is excellent since 88% of the total association is depicted. In Figure 2, our attention is directed towards the age group-gender points farthest from the origin, which are the most relevant for evaluating the association. Unlike what was observed for patients with T1D and macroalbuminuria, males with T1D and retinopathy are mainly characterized by having "high" weight variability and being in the "young" age group ("age1male"). We can also see in Figure 2 that females with T1D and retinopathy are characterized by having a "low" level of weight variability and being in the "old" age group ("age4female"). This result is consistent with the discussion of patients with T1D and retinopathy by Nyström et al. [28], who observed that overweight/obesity was more common in males than in females.