Growth Curve Models and Clustering Techniques for the Study of New Sugarcane Hybrids: An Integrated Approach with K-Means, K-Medoids, and DBSCAN

Carlos David Carretillo Moctezuma; María Guzmán Martínez; Flaviano Godínez Jaimes; José Concepción García Preciado; Ramón Reyes Carreto; José Torrones Salgado; Edgar Pérez Arriaga

doi:10.20944/preprints202409.0179.v1

Submitted: 02 September 2024

Posted: 03 September 2024


Abstract

Sugarcane (Saccharum spp.) is a crop of great industrial and alimentary importance, essential for the production of numerous products. Given its significance, genetic improvement programs exist that involve a rigorous study process, from material selection to the development of new varieties, requiring at least seven selection phases. This study modeled the growth curve of the sucrose percentage (SP) in 33 hybrids and six control varieties (MEX 69-290, ITV 92-1424, CP 72-2086, COLMEX 94-8, COLMEX 95-27, RB 85-5113) during the plant and ratoon periods in the experimental fields of the Melchor Ocampo Sugar Mill, Jalisco, Mexico. For clustering the materials, k-means, k-medoids, and Density-based spatial clustering of applications with noise (DBSCAN) algorithms were used, considering four maturity types among the control varieties. The DBSCAN algorithm proved to be the most effective, as the means between groups were not statistically equal. The hybrids identified as candidates for subsequent phases due to their high SP were COSTA JAL, ATEMEX 99-48, ATEMEX 99-1, ATEMEX 99-61, MEX 70-486, MEX 80-1521, ITSAMEX 07-44814, ITSAMEX 06-6395, and ITSAMEX 07-1903. These results are crucial for improving the productivity and sustainability of the crop, with significant implications for the sugar industry.

Keywords: Growth curves; clustering; varieties; sucrose percentage

Subject: Biology and Life Sciences - Agricultural Science and Agronomy

1. Introduction

The development of new sugarcane varieties offers a solution to the genetic deterioration experienced by commercial varieties that have been cultivated for several years and tend to exhibit a loss in their genetic diversity. This deterioration becomes evident through a decrease in the field yield, which subsequently impacts the industrial use of sugarcane. The lack of new genotypes to replace the deteriorated commercial varieties is the primary factor affecting the yield [1,2].

To obtain new varieties of sugarcane, materials that respond optimally to the environmental conditions and the needs of sugar agro-industries are proposed. The materials must meet requirements such as resistance to pests, diseases, and prolonged drought periods, as well as good adaptation to soil types, low post-harvest deterioration, and reduced production costs. It is also necessary for these varieties to have a good juice quality and an optimal amount of sucrose concentration. These traits will define the new commercial varieties [3,4].

Developing an outstanding commercial sugarcane variety for a specific region takes approximately ten to twelve years of constant evaluation both in the field and in an industrial setting. This process begins with the production of the seedling and ends with the evaluation of its performance in the field. The main phases that new materials must go through to become a commercial variety include strain selection, furrow testing, multiplication I, multiplication II, multiplication III, plot establishment, adaptation, agro-industrial evaluation, and semi-commercial testing [3,5].

During the adaptability phase, the goal is to evaluate the new hybrids under different soil and climate conditions. In this stage, both agronomic and industrial quality variables are assessed. The main agronomic variables include the germination percentage, height, population density, and health status. On the other hand, industrial quality variables include the sucrose percentage (SP), degrees Brix, purity, fiber content, and reducing sugars. Destructive sampling is carried out during this phase to study the maturity curves of the genotypes [5,6].

In the agro-industrial phase, the most important variables include the polarization percentage (PP), which indicates the content of sucrose in grams in the sample [7], degrees Brix, purity, fiber content, juiciness, and hardness of the sugarcane rind [5]. Some researchers utilize the PP to study the maturity of the materials [8] since the apparent sucrose is equivalent to the PP [9].

The physiological or biological maturity of sugarcane is determined by the accumulation of sucrose over time [10]. This leads to what is known as the sucrose accumulation curve, which can help one to identify the maturity type of the sugarcane cultivar. This curve is of interest to the sugar industry for genotype selection and is studied starting in the material’s adaptability phase [5,11].

Growth curves in agricultural crops are useful for understanding a plant’s development over time within a specific ecosystem. Typically, studying these curves involves making repeated measurements over time, and such data are referred to as longitudinal data [12]. Since this type of data often exhibits temporal autocorrelation, it is essential to employ an appropriate statistical methodology to analyze these data.

The growth curve of the SP is modeled using linear models with mixed effects that capture the variability of genotypes over time and the variability between genotypes. These mixed-effects models are known as growth curve models and are suitable for studying long yield cycles in perennial plants [13].

Some researchers compared commercial and new hybrids using measurements over time of several variables such as the degrees Brix, the PP, and the accumulation of sucrose. Quadratic regression was used to determine the sucrose accumulation over time, identify the optimal harvesting time, and maximize the sugar yield from each cultivar [14].

In another study, sucrose accumulation curves for different sugarcane genotypes in the Tucumán region in Argentina were studied. The variable under study was the PP. For each genotype, a nonlinear regression model composed of two polynomials with a break point, also known as a segmented polynomial, was fitted. To cluster the materials, the authors used the estimated sucrose accumulation curve and applied hierarchical clustering and partitioning methods, specifically Unweighted Pair Group Method using Arithmetic averages (UPGMA) and k-means, respectively. They found that the sucrose accumulation curves of the genotypes are influenced by the environment [11].

Some studies have been conducted on sugarcane genotypes to determine the agronomic response of new cultivars. The variables assessed were the tons of sugarcane per hectare, the tons of PP per hectare, and the sugar content throughout the harvest period. Using analysis of variance, multivariate analysis, and linear regression, researchers determined the best agro-industrial response, the optimal maturity stage for these cultivars, and the classification of the maturity type into the categories early, middle, and late [8].

The materials studied in this research have not been examined beyond the adaptability phase; therefore, there is uncertainty regarding how they will perform in terms of SP compared to the established varieties. This study will enable us to comprehend the sucrose accumulation curve’s growth pattern for both the new materials and the control varieties, as well as variations within each group across different cycles. This information could provide valuable insights for genetic improvement and the selection of more productive sugarcane varieties.

The objective of this work is to study the sucrose accumulation curves of 33 hybrids (new materials) that are in the adaptability testing phase, together with six control varieties (the commercial varieties MEX 69-290, ITV 92-1424, CP 72-2086, COLMEX 94-8, COLMEX 95-27, and RB 85-5113), and to group the materials by sucrose content in the cane into early, early-intermediate, intermediate-late, and late-ripening types [15]. Because these materials have not been studied beyond the adaptability phase, the growth curve methodology employed here provides a more precise description of their SP behavior relative to the established varieties, and of how that behavior varies from one cycle to another.

2. Materials and Methods

A total of 39 materials (Table 1) from the experimental fields of the National Institute of Forestry, Agricultural and Livestock Research (INIFAP) were studied. They form two groups: the control materials and the new materials. The control group corresponds to six commercial varieties and the new materials correspond to 33 varieties of Mexican and foreign origin. The new materials are in the adaptability phase, and all the materials were evaluated in the plant and ratoon crop cycles under irrigation conditions in both seasons.

The experiment took place at Las Pilas experimental field (19°46′48.85″ N, 104°13′57.65″ W) and the Melchor Ocampo Sugar Mill (19°47′18.0″ N, 104°14′24.3″ W), which is located in the Zacapala area of Autlán de Navarro, Jalisco, México [6]. The experimental field has loamy-sandy soil, with an accumulated annual precipitation of 732 mm, an altitude of 860 m above sea level, and an average annual temperature of 23.9 °C. The experimental unit consisted of four rows, with each row measuring 15.0 m in length and 1.4 m in width. The useful plot area consisted of the two central rows, with 2 m removed from the ends.

The SP of the materials was measured through destructive sampling on four dates (Table 2), with an approximate difference of one month between the measurement dates. Seven months elapsed from planting to the first measurement date in the plant cycle, and six months from harvesting to the first measurement date in the ratoon cycle.

2.1. Growth Curve Model

The linear mixed model is given by

$$y = X\beta + Zu + \epsilon, \tag{1}$$

where $y$ is the $n \times 1$ response vector; $X$ and $Z$ are design matrices of dimensions $n \times p$ and $n \times q$, respectively; $\beta$ is the fixed-effects parameter vector of dimensions $p \times 1$; $u$ is the random-effects vector of dimensions $q \times 1$; and $\epsilon$ is the error vector of dimensions $n \times 1$. In Model (1), $X\beta$ is the fixed component, and $Zu + \epsilon$ is the random component. It is assumed that $u$ and $\epsilon$ are independent, that is, $\mathrm{Cov}(u, \epsilon) = 0$, and that they each have a multivariate normal distribution:

$$u \sim N_{q}(0, D), \qquad \epsilon \sim N_{n}(0, R),$$

where $D$ and $R$ are the covariance matrices of the random-effects vector and the error vector, respectively. From these assumptions, we have

$$y \sim N_{n}(X\beta, V),$$

where $V = \mathrm{Cov}(y) = Z D Z^{T} + R$. By using the Henderson equations [16] to estimate the parameters of Model (1), the best linear unbiased estimator (BLUE) of $\beta$ and the best linear unbiased predictor (BLUP) of $u$ are obtained:

$$\begin{aligned} \hat{\beta} &= (X^{T} V^{-1} X)^{-1} X^{T} V^{-1} y, \\ \hat{u} &= D Z^{T} V^{-1} (y - X\hat{\beta}). \end{aligned}$$
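For reference, the mixed model equations usually attributed to Henderson can be written in the notation of Model (1) as shown below. This is the standard textbook form, given here as a reading aid; the section does not state which numerical routine was used, so it should not be read as a description of the authors' software. Solving this system yields the same $\hat{\beta}$ and $\hat{u}$ as the closed-form expressions above, without inverting the $n \times n$ matrix $V$.

```latex
\begin{pmatrix}
X^{T} R^{-1} X & X^{T} R^{-1} Z \\
Z^{T} R^{-1} X & Z^{T} R^{-1} Z + D^{-1}
\end{pmatrix}
\begin{pmatrix} \hat{\beta} \\ \hat{u} \end{pmatrix}
=
\begin{pmatrix} X^{T} R^{-1} y \\ Z^{T} R^{-1} y \end{pmatrix}
```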

Let $y_{ij}$ be the j-th observation of the i-th material, with $j = 1, \ldots, n_{i}$ (number of measurements) and $i = 1, \ldots, n$ (number of materials). The growth curve model is given by

$$y_{ij}(t_{ij}) = \beta_{0} + \beta_{1} t_{ij} + u_{0i} + u_{1i} t_{ij} + \epsilon_{ij}. \tag{2}$$

The fixed component of Model (2) is $\beta_{0} + \beta_{1} t_{ij}$ and the random component is $u_{0i} + u_{1i} t_{ij} + \epsilon_{ij}$. According to this model and considering four measurement dates, the design matrices for each material i are of dimensions $4 \times 2$ and are given by

$$X_{i} = Z_{i} = \begin{pmatrix} 1 & t_{i1} \\ 1 & t_{i2} \\ 1 & t_{i3} \\ 1 & t_{i4} \end{pmatrix}.$$

The fixed-effects parameter vector, the random-effects vectors, and the error vectors are given by

$$\beta = \begin{pmatrix} \beta_{0} \\ \beta_{1} \end{pmatrix}, \qquad u_{i} = \begin{pmatrix} u_{0i} \\ u_{1i} \end{pmatrix}, \qquad \epsilon_{i} = \begin{pmatrix} \epsilon_{i1} \\ \epsilon_{i2} \\ \epsilon_{i3} \\ \epsilon_{i4} \end{pmatrix}.$$

Then, the model for the i-th material is given by

$$y_{i} = X_{i}\beta + Z_{i}u_{i} + \epsilon_{i}, \qquad i = 1, \ldots, 39. \tag{3}$$

The random effects $u_{i}$, $i = 1, \ldots, 39$, are assumed to be independent and identically distributed, that is, $u_{i} \sim N(0, D)$, with

$$D = \mathrm{Var}(u_{i}) = \begin{pmatrix} \sigma_{u_{0}}^{2} & \sigma_{(u_{0}, u_{1})} \\ \sigma_{(u_{0}, u_{1})} & \sigma_{u_{1}}^{2} \end{pmatrix}.$$

Several covariance structures are available for $D$; they provide a more accurate representation of the variability and correlation between observations within a group or subject [17], which enables a more flexible and appropriate model fit to longitudinal data.
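To illustrate how the random intercept and slope induce correlation between the repeated measurements of one material, the sketch below evaluates $V_i = Z_i D Z_i^T + R_i$ (the per-material version of the general result $V = ZDZ^T + R$) for four time points. The function name, the numerical values of $D$, the residual variances, and the time coding are all hypothetical and are not taken from the paper; a diagonal $R_i$ is assumed.

```python
def marginal_covariance(times, D, residual_variances):
    """Return V_i = Z_i D Z_i^T + R_i, where row j of Z_i is (1, t_ij)
    and R_i is diagonal with the given residual variances."""
    Z = [[1.0, float(t)] for t in times]
    n = len(times)
    V = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            # (Z D Z^T)[j][k] = sum over a, b of Z[j][a] * D[a][b] * Z[k][b]
            V[j][k] = sum(
                Z[j][a] * D[a][b] * Z[k][b] for a in range(2) for b in range(2)
            )
        V[j][j] += residual_variances[j]  # add the diagonal of R_i
    return V


# Hypothetical variance components, chosen only for illustration.
D = [[1.0, -0.2], [-0.2, 0.1]]
V = marginal_covariance([0, 1, 2, 3], D, [0.5, 0.5, 0.5, 0.5])
```

The off-diagonal entries of `V` are nonzero even though the residuals are uncorrelated, which is how the model accounts for the dependence between measurements of the same material.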

The error vectors $\epsilon_{i}$, $i = 1, \ldots, 39$, are also assumed to be independent and identically distributed, that is, $\epsilon_{i} \sim N_{n_{i}}(0, R_{i})$, with

$$R_{i} = \mathrm{Cov}(\epsilon_{i}) = \begin{pmatrix} \mathrm{Var}(\epsilon_{i1}) & \mathrm{Cov}(\epsilon_{i1}, \epsilon_{i2}) & \mathrm{Cov}(\epsilon_{i1}, \epsilon_{i3}) & \mathrm{Cov}(\epsilon_{i1}, \epsilon_{i4}) \\ \mathrm{Cov}(\epsilon_{i2}, \epsilon_{i1}) & \mathrm{Var}(\epsilon_{i2}) & \mathrm{Cov}(\epsilon_{i2}, \epsilon_{i3}) & \mathrm{Cov}(\epsilon_{i2}, \epsilon_{i4}) \\ \mathrm{Cov}(\epsilon_{i3}, \epsilon_{i1}) & \mathrm{Cov}(\epsilon_{i3}, \epsilon_{i2}) & \mathrm{Var}(\epsilon_{i3}) & \mathrm{Cov}(\epsilon_{i3}, \epsilon_{i4}) \\ \mathrm{Cov}(\epsilon_{i4}, \epsilon_{i1}) & \mathrm{Cov}(\epsilon_{i4}, \epsilon_{i2}) & \mathrm{Cov}(\epsilon_{i4}, \epsilon_{i3}) & \mathrm{Var}(\epsilon_{i4}) \end{pmatrix}.$$

This matrix can be decomposed in the form

$$R_{i} = \sigma^{2} \Lambda_{i} C_{i} \Lambda_{i},$$

where $\Lambda_{i} = \mathrm{diag}(\lambda_{1}, \ldots, \lambda_{n_{i}})$ is a diagonal matrix with nonnegative diagonal elements for the i-th material, with $\lambda_{j}$, $j = 1, \ldots, n_{i}$, representing variance functions [17], and $C_{i}$ is a correlation matrix for the i-th material. In this way, heteroscedasticity in the elements of the vector $\epsilon_{i}$ is explained by $\Lambda_{i}$, and $C_{i}$ explains the correlation between the observations within the group. Structures for $C_{i}$ include the autoregressive process of order 1 and other autoregressive structures [18].

For the plant cycle, Model (3) was used. The varPower variance function was used, indicating that the error variance changes as a power function of time. The specification of $R_{i}$ assumes that $C_{i} = I$, and the elements of $\Lambda_{i}$ are defined by the varPower$(\cdot)$ variance function; their dependence on time is given by $\lambda_{j} = |t_{ij}|^{\delta}$, where $\delta$ is a constant. Then,

$$R_{i} = \sigma^{2} \begin{pmatrix} t_{i1}^{2\delta} & 0 & 0 & 0 \\ 0 & t_{i2}^{2\delta} & 0 & 0 \\ 0 & 0 & t_{i3}^{2\delta} & 0 \\ 0 & 0 & 0 & t_{i4}^{2\delta} \end{pmatrix}.$$
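As a small numerical illustration of the power variance structure, the sketch below computes the diagonal of $R_i$, that is, $\sigma^{2}|t_{ij}|^{2\delta}$ for each measurement time. The function name and the values of $\sigma$, $\delta$, and the time points are hypothetical and are not estimates from the paper.

```python
def var_power_diagonal(times, sigma, delta):
    """Diagonal of R_i = sigma^2 * diag(|t_ij|^(2*delta)), assuming C_i = I."""
    return [sigma ** 2 * abs(t) ** (2 * delta) for t in times]


# Hypothetical inputs: four monthly measurements, sigma = 1, delta = 0.5.
diag_R = var_power_diagonal([7, 8, 9, 10], sigma=1.0, delta=0.5)

# With delta = 0 every element equals sigma^2, i.e. a constant-variance model.
constant = var_power_diagonal([7, 8, 9, 10], sigma=2.0, delta=0)
```

With $\delta = 0.5$ the residual variance grows in proportion to time, while $\delta = 0$ recovers the homoscedastic case $R_i = \sigma^{2} I$.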

Model (3) was also used for the ratoon cycle, with $u_{1i} = 0$ for all $i = 1, \ldots, 39$. The varIdent variance function was used; it is suitable when the data exhibit less variation within a group. For the specification of $R_{i}$, $C_{i} = I$ was assumed, and the elements of $\Lambda_{i}$ are defined by the varIdent variance function; their dependence on time is given by $\lambda_{j} = \delta_{S_{ij}}$. We set $\delta_{S_{ij}} = 1$ for all i and j; then,

$$R_{i} = \sigma^{2} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

For the plant cycle, the intraclass correlation coefficient $\rho_{ICC}$ was calculated; it indicates the degree of correlation that exists between the observations of the same group. It is calculated using the following equation:

$$\rho_{ICC} = \frac{\sigma_{(u_{0}, u_{1})}}{\sigma_{u_{0}} \sigma_{u_{1}}}, \tag{4}$$

where $\sigma_{(u_{0}, u_{1})}$ is the covariance between the random effects of the intercept and slope, and $\sigma_{u_{0}}$ and $\sigma_{u_{1}}$ are the standard deviations of the random intercept and the random slope, respectively.
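Equation (4) can be evaluated directly from the entries of the random-effects covariance matrix $D$. The following sketch does so for a hypothetical $D$; the function name and the numbers are illustrative assumptions, not values from Table 3.

```python
import math


def random_effects_correlation(D):
    """Equation (4): covariance of (u0, u1) over the product of their
    standard deviations, read from a 2x2 covariance matrix D."""
    var_u0, cov_u0_u1 = D[0][0], D[0][1]
    var_u1 = D[1][1]
    return cov_u0_u1 / (math.sqrt(var_u0) * math.sqrt(var_u1))


# Hypothetical covariance matrix with a negative intercept-slope covariance.
D_example = [[4.0, -1.2], [-1.2, 0.64]]
rho = random_effects_correlation(D_example)
```

A negative value of `rho` means that materials with a higher random intercept tend to have a smaller random slope.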

For the clustering analysis, two partition-based algorithms, k-means and k-medoids [19], and one density-based algorithm that is robust to outliers, DBSCAN [20], were used.

The clusters identified were evaluated using the silhouette index, the Dunn index, and the connectivity index.

The silhouette index measures the confidence with which an observation is assigned to a cluster, providing information about the cohesion and separation of the clusters. This index varies in the range from $-1$ to 1, where a value close to 1 indicates that the point has been correctly assigned to the cluster and is well separated from other clusters [21]. Values from 0.71 to 1 indicate that a strong structure has been found; from 0.51 to 0.70, a reasonable structure; from 0.26 to 0.50, a weak structure; and values ≤ 0.25, no substantial structure [22]. Negative values indicate that the object may be incorrectly assigned and is closer to points from another cluster [23]. The silhouette index is given by

$$I_{S} = \frac{1}{n} \sum_{k=1}^{K} \sum_{i \in C_{k}} \frac{b(i) - a(i)}{\max\{a(i), b(i)\}}, \tag{5}$$

where n is the total number of objects in the dataset, K is the number of clusters, $C_{k}$ is the set of objects in cluster k, $a(i)$ is the average distance between object i and all other objects in the same cluster, $b(i)$ is the average distance between object i and all objects in the nearest cluster, and $\max\{a(i), b(i)\}$ is the maximum value of $a(i)$ and $b(i)$.
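A minimal sketch of Equation (5) with Euclidean distance is given below, together with a helper that maps a value to the interpretation bands quoted above. The function names and the toy data are hypothetical; the treatment of singleton clusters (contribution of zero) is an assumed convention, and at least two clusters are required.

```python
import math


def silhouette_index(points, labels):
    """Mean silhouette width over all objects (Equation (5))."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            continue  # assumed convention: a singleton contributes 0
        # a(i): mean distance to the other members of the same cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b(i): smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


def structure_strength(value):
    """Interpretation bands for the silhouette index as quoted in the text."""
    if value > 0.70:
        return "strong"
    if value > 0.50:
        return "reasonable"
    if value > 0.25:
        return "weak"
    return "no substantial structure"


# Toy data: two tight, well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]
s = silhouette_index(points, labels)
```

For the toy data the index is close to 0.9, which falls in the strong-structure band.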

The Dunn index is a measure of the quality of a partition in terms of the ratio between the minimum distance between cluster centroids and the maximum distance between points and their respective centroids [24]. This index is given by

$$I_{D} = \frac{\min_{k = 1, \ldots, K} \{d(c_{k}, c_{0})\}}{\max_{x_{i} \in \Omega} \{d(x_{i}, c_{k})\}}, \tag{6}$$

where K is the number of clusters, $\min_{k = 1, \ldots, K} \{d(c_{k}, c_{0})\}$ is the minimum of the distances between the centroid $c_{k}$ of each cluster and the global center $c_{0}$, and $\max_{x_{i} \in \Omega} \{d(x_{i}, c_{k})\}$ is the maximum distance between each point $x_{i}$ in the dataset $\Omega$ and the centroid $c_{k}$ of its corresponding cluster. The minimum distance in the numerator is a measure of the separation between clusters, and the maximum distance in the denominator is a measure of the internal coherence of the clusters.
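The sketch below evaluates Equation (6) exactly as it is written in this section, that is, using centroid-to-global-center distances in the numerator and point-to-own-centroid distances in the denominator. Other formulations of the Dunn index exist (for example, based on pairwise inter-cluster distances and cluster diameters), so this should be read as an illustration of Equation (6) only. Function names and data are hypothetical.

```python
import math


def centroid(pts):
    """Coordinate-wise mean of a list of points."""
    dim = len(pts[0])
    return tuple(sum(p[d] for p in pts) / len(pts) for d in range(dim))


def dunn_index_eq6(points, labels):
    """Ratio in Equation (6): min centroid-to-global-center distance over
    max point-to-own-centroid distance (Euclidean)."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    centroids = {lab: centroid(pts) for lab, pts in groups.items()}
    global_center = centroid(points)
    numerator = min(math.dist(c, global_center) for c in centroids.values())
    denominator = max(
        math.dist(p, centroids[lab]) for p, lab in zip(points, labels)
    )
    return numerator / denominator


# Toy data: two tight, well-separated pairs of points.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labs = [0, 0, 1, 1]
dunn = dunn_index_eq6(pts, labs)
```

Larger values indicate clusters that are compact relative to their distance from the global center.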

The connectivity index is calculated by constructing a neighbors matrix. Define $nn_{i(j)}$ as the j-th nearest neighbor of observation i, and let $x_{i, nn_{i(j)}}$ be zero if i and $nn_{i(j)}$ are in the same cluster and $1/j$ otherwise. The connectivity index has a value between 0 and ∞ and should be minimized [25]. This index is given by

$$I_{C} = \sum_{i=1}^{n} \sum_{j=1}^{L} x_{i, nn_{i(j)}}, \tag{7}$$

where n is the total number of objects in the dataset, and L is a parameter that determines the number of neighbors that contribute to the connectivity measure.
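Equation (7) can be sketched as follows, assuming Euclidean distance and ties broken by the order of the data. The function name, the default value of L, and the toy data are hypothetical; the value of L used in the study is not stated in this section.

```python
import math


def connectivity_index(points, labels, L=2):
    """Equation (7): for each observation, add 1/j whenever its j-th nearest
    neighbour (j = 1..L) is assigned to a different cluster."""
    n = len(points)
    total = 0.0
    for i in range(n):
        # indices of the other observations, nearest first
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        for rank, j in enumerate(neighbours[:L], start=1):
            if labels[j] != labels[i]:
                total += 1.0 / rank
    return total


# Toy data: two tight, well-separated pairs of points.
data = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
assignment = [0, 0, 1, 1]
conn_one = connectivity_index(data, assignment, L=1)
conn_two = connectivity_index(data, assignment, L=2)
```

With one neighbor per point the index is zero because every nearest neighbor shares its cluster; with two neighbors each point is penalized by 1/2 for its second neighbor.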

3. Results

3.1. Plant Cycle

Table 3 presents the parameter estimates of the fitted model, Model (3), along with their respective 95% confidence intervals (CIs) for SP, showing both fixed and random effects. The fixed effects include the estimated parameters $\hat{\beta}_{0}$ and $\hat{\beta}_{1}$, both significant, which represent the intercept and slope, respectively. The estimated intercept reflects the average SP value when time is zero, that is, the expected SP at the start of the measurement. The slope indicates the change in SP for each unit change in time; thus, a value of 1.50 signifies that, on average, SP increases by 1.50 units for each additional unit of time.

The fitted model for the plant cycle satisfies the assumptions of normality (Kolmogorov-Smirnov, p-value = 0.85), homogeneity of variance (Bartlett, p-value = 0.34), and independence (Box-Pierce, p-value = 0.23). The correlation coefficient ($\rho_{ICC}$) suggests a negative correlation, indicating a strong inverse relationship between the intercept and the slope. This means that when the intercept is high, the slope tends to decrease, and vice versa. This information is valuable for understanding how different materials will respond and evolve in terms of SP over time. The parameter estimates are presented below.

The estimated parameters for the 39 materials from Las Pilas are shown in Table 4. The variety COLMEX 94-8, which is known to have an Early maturity, has an estimated intercept of 8.97 and an estimated slope of 1.22; similar values are observed for the variety RB 85-5113, which has an estimated intercept of 8.23 and an estimated slope of 1.23. The variety MEX 69-290, which is known to have an Intermediate-Late maturity, has an estimated intercept of 5.67 and an estimated slope of 1.82. From the estimations of all materials, it can be deduced that materials with an Early maturity tend to have a higher intercept. Since both the intercept and the slope are random, their values vary depending on the specific sugarcane material being evaluated. Thus, materials that have a Late maturity tend to have a smaller intercept and a larger slope.
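To make the trade-off between intercept and slope concrete, the sketch below takes the two fitted lines quoted above (COLMEX 94-8 and MEX 69-290) and computes the time at which they would predict the same SP. The helper names are hypothetical, and the result is expressed in the model's own time units; whether that time falls inside the sampled period depends on how time was coded, which this section does not specify.

```python
def predicted_sp(intercept, slope, t):
    """Fitted SP for one material at time t: intercept + slope * t."""
    return intercept + slope * t


def crossing_time(line_a, line_b):
    """Time at which two fitted lines (intercept, slope) predict the same SP."""
    (intercept_a, slope_a), (intercept_b, slope_b) = line_a, line_b
    return (intercept_a - intercept_b) / (slope_b - slope_a)


# (intercept, slope) estimates quoted in the text for the plant cycle.
colmex_94_8 = (8.97, 1.22)
mex_69_290 = (5.67, 1.82)
t_star = crossing_time(colmex_94_8, mex_69_290)
```

Before `t_star` the material with the higher intercept has the higher fitted SP; afterwards the material with the steeper slope overtakes it, which is the pattern the text associates with later-maturing materials.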

For the commercial varieties, the maturity type (Early, Early-Intermediate, Intermediate-Late, and Late) is known for the soil and climate conditions of the region [26]. Meanwhile, for the new materials, the process of identifying the maturity type based on the climate and soil conditions being evaluated is being carried out. Taking this information into account, the materials are grouped.

3.1.1. Clustering

Clustering was performed using the intercept and slope estimates of the model. Since there are four maturity types among the control materials, the number of groups was set to four according to the maturity types of the commercial materials.
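As an illustration of clustering on (intercept, slope) pairs, the following is a minimal k-means sketch (Lloyd's iterations with fixed initial centers). It is not the authors' implementation: the function name, the initial centers, and the four data points are hypothetical, and the example uses two groups rather than four to keep it short.

```python
import math


def kmeans(points, initial_centers, max_iter=100):
    """Plain k-means: assign each point to its nearest center, then move
    each center to the mean of its members, until labels stop changing."""
    centers = [tuple(c) for c in initial_centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster becomes empty
                centers[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Hypothetical (intercept, slope) estimates for four materials.
estimates = [(9.0, 1.2), (8.2, 1.2), (5.7, 1.8), (5.5, 1.9)]
labels_km, centers_km = kmeans(estimates, initial_centers=[estimates[0], estimates[2]])
```

Because the intercept and slope are on different scales, a real analysis may need to standardize them before computing distances; that choice is not addressed here.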

The clustering of the materials using the k-means algorithm with $k = 4$ is shown in Figure 1. The variety MEX 69-290, which is known to have an Intermediate-Late maturity, belongs to the orange group, which also contains 11 new materials; this group is therefore classified as Intermediate-Late. The varieties ITV 92-1424, RB 85-5113, CP 72-2086, and COLMEX 95-27, which are known to have an Early maturity, belong to the green group together with ten new materials, so this is the Early maturity group. The materials shown in brown are classified as having a Late maturity; this group comprises eight new materials. Finally, the Early-Intermediate group (yellow) contains four new materials and the variety COLMEX 94-8, which is known to have an Early-Intermediate maturity.

The validation indexes for clustering the materials with k-means are $I_{S} = 0.63$, $I_{D} = 0.12$, and $I_{C} = 13.41$. The value of $I_{S} = 0.63$ indicates that the grouping of materials shows moderately good cohesion and appropriate separation between the clusters. The value of $I_{D} = 0.12$ indicates relatively low separation between clusters and a potential for improvement in cohesion within the clusters. Meanwhile, the value of $I_{C} = 13.41$ indicates good separation between clusters and appropriate compactness within the clusters. Therefore, two out of three indexes indicate that the k-means clustering is good.

The clustering of the materials using the k-medoids algorithm with $k = 4$ is shown in Figure 2. The yellow group contains seven materials classified as having an Early-Intermediate maturity. The green group consists of 12 materials, including the commercial varieties ITV 92-1424, RB 85-5113, CP 72-2086, and COLMEX 94-8, all of which are reported to have an Early maturity. The brown group contains eight materials with a Late maturity. Finally, the orange group includes 11 materials with an Intermediate-Late maturity, as it incorporates the variety MEX 69-290.

The validation indexes for clustering the materials with k-medoids are $I_{S} = 0.55$, $I_{D} = 0.12$, and $I_{C} = 13.85$. The value of $I_{S} = 0.55$ indicates that the clustering exhibits moderate cohesion and appropriate separation among the groups, but there is still room for improvement in terms of group separation. The value of $I_{D} = 0.12$ indicates relatively low separation between clusters and a potential for improvement in cohesion within the clusters. The value of $I_{C} = 13.85$ indicates good separation between the groups and appropriate compactness within the clusters.

The clustering of the materials using the DBSCAN algorithm with $k = 4$ clusters is shown in Figure 3. The Early group (green) consists of 11 new materials and four commercial varieties (ITV 92-1424, RB 85-5113, CP 72-2086, and COLMEX 95-27). The Early-Intermediate group (yellow) includes four materials and the commercial variety COLMEX 94-8. The Intermediate-Late group (orange) comprises 16 new materials, along with the commercial variety MEX 69-290. Lastly, the Late group (brown) consists of two new materials.

The validation indexes for clustering the materials with DBSCAN are $I_{S} = 0.64$, $I_{D} = 0.24$, and $I_{C} = 12.36$. The values of $I_{S} = 0.64$ and $I_{D} = 0.24$ indicate that the clustering exhibits moderately good cohesion and adequate separation between the groups. The value of $I_{C} = 12.36$ suggests reasonable connectivity between the groups, implying that there is internal connectedness within each group and good relationships between observations in each group. Thus, it can be concluded that the clustering performed with DBSCAN has achieved satisfactory separation and cohesion among the groups, along with suitable connectivity.

Table 5 presents the maturity classification of the materials according to the algorithm used. This table shows the similarities and differences in the classifications made by the algorithms. The k-means and DBSCAN algorithms are more similar to each other than to k-medoids, as they coincide for 32 materials in terms of the maturity type assigned. For k-means and k-medoids, 26 coincidences were found in the maturity type assigned. Lastly, k-medoids and DBSCAN only have 20 coincidences. Thus, the DBSCAN algorithm proved to be the most effective, as the group means were not statistically equal.
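The coincidence counts reported above amount to counting, material by material, how often two algorithms assign the same maturity type. A minimal sketch of that count follows; the function name and the five example labels are hypothetical and do not reproduce Table 5.

```python
def count_coincidences(labels_a, labels_b):
    """Number of materials that receive the same maturity type from two
    algorithms; both lists must follow the same material order."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both classifications must cover the same materials")
    return sum(a == b for a, b in zip(labels_a, labels_b))


# Hypothetical classifications of five materials by two algorithms.
algo_1 = ["Early", "Early", "Late", "Intermediate-Late", "Early-Intermediate"]
algo_2 = ["Early", "Early-Intermediate", "Late", "Intermediate-Late", "Early"]
matches = count_coincidences(algo_1, algo_2)
```

This comparison relies on the clusters having already been named by maturity type, as done in the text using the control varieties; raw cluster numbers from different algorithms are not directly comparable.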

3.2. Ratoon Cycle

For this cycle, incorporating only the random intercept was sufficient to improve the fit of Model (3), which allowed for capturing the variation in the intercept, indicating notable differences in the initial SP among the different genotypes. Table 6 shows that both the intercept and the slope of the fitted model for the ratoon cycle are significant, with narrow confidence intervals, indicating high precision in the estimates. Additionally, the model satisfies the assumptions of normality (Kolmogorov-Smirnov, p-value = 0.85), homogeneity of variances (Bartlett, p-value = 0.34), and independence (Box-Pierce, p-value = 0.23). The significant variability among the subject intercepts ($\hat{\sigma}_{u_{0}} = 1.18$) and the moderate residual variability ($\hat{\sigma}_{e} = 1.05$) suggest that the model adequately captures the variation present in the data.

The use of growth curve models reveals that the materials exhibit different SP growth curves. These differences are observed among hybrids, varieties, cycles, and even within the same hybrid, which can show distinct behaviors from one cycle to another.

Table 7 displays the estimated parameters for the 39 materials. Similar to the plant cycle, a higher intercept is observed in varieties known to have an Early maturity: ITV 92-1424 had an intercept of 8.30, CP 72-2086 had an intercept of 9.35, and COLMEX 95-27 had an intercept of 9.37. Higher intercepts were also observed for the Early-Intermediate maturity variety: COLMEX 94-8 had an intercept of 9.20. A lower intercept is observed in the MEX 69-290 variety, which had an intercept of 7.16.

The estimates of the models for the plant and ratoon cycles show some differences in terms of the estimated intercept. For instance, the COLMEX 95-27 variety had an SP of

7.56