The cutoff yield of 12,000 kg ha⁻¹ was used to classify the 186 examined specimens into two subpopulations: a productive subpopulation of 19 specimens (10%) and a less productive subpopulation of 167 specimens (90%). The high proportion of specimens in the less productive subpopulation indicates a disturbed nutrient equilibrium, that is, a high rejection rate of the nutrient balance. Applying the χ² distribution function with six degrees of freedom to this proportion of 90% yields a theoretical GNII value of 2.2, which gives a useful indication of potential nutritional imbalance within the studied population (Figure 1). A specimen whose nutrient signature exceeds the GNII threshold of 2.2 is at increased risk of nutritional imbalance, which implies that the distribution of the five nutrients in the corn kernels may be inadequate and may lead to reduced productivity.
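The step from the 90% proportion to the theoretical value of 2.2 can be checked numerically. The following is a minimal sketch, assuming that the 90% is read as the probability that a χ² variable with six degrees of freedom exceeds the critical value. The function names are illustrative and are not taken from the study.

```python
import math


def chi2_cdf_6df(x: float) -> float:
    """CDF of the chi-square distribution with 6 degrees of freedom.

    For an even number of degrees of freedom the CDF has a closed form.
    With 6 degrees of freedom: 1 - exp(-h) * (1 + h + h**2 / 2), h = x / 2.
    """
    if x <= 0:
        return 0.0
    h = x / 2.0
    return 1.0 - math.exp(-h) * (1.0 + h + h * h / 2.0)


def critical_value(exceedance: float, lo: float = 0.0, hi: float = 100.0) -> float:
    """Find x such that P(chi-square with 6 df > x) = exceedance, by bisection."""
    target = 1.0 - exceedance  # cumulative probability below x
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf_6df(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Assumption: 90% is the share of the less productive subpopulation (167 of 186).
print(round(critical_value(0.90), 1))  # 2.2
```

Under this reading, the rounded 90% proportion reproduces the reported theoretical GNII value of 2.2.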
3.2.1. Nutrient Imbalance Index Threshold Validation
Applying the Cate-Nelson partition method identified a critical yield of 11,000 kg ha⁻¹, which corresponds to a GNII value of 1.6 and splits the dataset into four quadrants (Figure 2). The four quadrants and their corresponding outcomes are: i) True positive (TP, n = 24): corn grains with high yields that are correctly diagnosed by the global nutrient imbalance index (GNII). These grains had yields ≥ 11,000 kg ha⁻¹ and were correctly identified as nutrient balanced by their GNII values ≤ 1.6; ii) True negative (TN, n = 98): corn grains with low yields that are correctly diagnosed by the GNII. These grains had yields < 11,000 kg ha⁻¹ and were correctly identified as nutrient imbalanced by their GNII values > 1.6; iii) False negative (FN, n = 5): corn grains with high yields that are incorrectly diagnosed by the GNII. These grains had yields ≥ 11,000 kg ha⁻¹ but were classified as nutrient imbalanced (GNII > 1.6) although they were in fact balanced; iv) False positive (FP, n = 59): corn grains with low yields that are incorrectly diagnosed by the GNII. These grains had yields < 11,000 kg ha⁻¹ but were classified as nutrient balanced (GNII ≤ 1.6) although their nutrient composition was imbalanced. The yield limit of 11,000 kg ha⁻¹ was determined by minimizing the number of points in the two error quadrants (FN and FP), which together contain 64 of the 186 points. The GNII threshold, identified by the peak in the sum of squares (Figure 2c), enables differentiation between balanced and imbalanced nutritional states and helps identify corn kernels with a favorable nutrient composition and high yield potential. This classification supports informed decisions on crop management and nutritional interventions to improve crop productivity and overall agricultural performance.
The proportion of points in the TP and TN quadrants relative to all points in the dataset measures the robustness of the Cate-Nelson procedure, expressed as R² = (TP + TN)/(TP + TN + FP + FN), with a calculated value of 65%. This value means that the balanced or imbalanced nutrient status of 65% of the total population was correctly identified, which supports the validity and reliability of the proposed Cate-Nelson model.
The positive predictive value (PPV) is the proportion of true positives (correctly identified balanced kernels) among all grains classified as balanced by the model, that is, the probability that a corn kernel with a GNII at or below 1.6 is truly balanced in its nutritional status. The calculated PPV of 29% shows that the model has limited ability to identify truly balanced corn kernels, because many kernels classified as balanced are in fact imbalanced (false positives). In contrast, the negative predictive value (NPV), which is the probability that a corn kernel with a GNII greater than 1.6 does show an imbalanced nutritional state and a low yield, was calculated at 95%.
Sensitivity and specificity were also calculated to evaluate the performance of the adapted Cate-Nelson model. The calculated sensitivity, TP/(TP + FN), was 82%; it is the probability that a specimen with a yield at or above the yield cutoff is correctly diagnosed as balanced by the GNII threshold. The calculated specificity, TN/(TN + FP), was 62%; it is the probability that a specimen with a lower corn grain yield is diagnosed as having an imbalanced grain nutrient status (GNII > 1.6).
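The diagnostic statistics above follow directly from the four quadrant counts. The sketch below recomputes them from the reported values of TP, TN, FN, and FP using the standard definitions; the function and dictionary key names are illustrative.

```python
# Quadrant counts reported in the text for the Cate-Nelson partition.
TP, TN, FN, FP = 24, 98, 5, 59


def diagnostic_metrics(tp: int, tn: int, fn: int, fp: int) -> dict:
    """Standard 2x2 diagnostic statistics from the four quadrant counts."""
    total = tp + tn + fn + fp
    return {
        "accuracy": (tp + tn) / total,    # share of correctly diagnosed points
        "ppv": tp / (tp + fp),            # positive predictive value
        "npv": tn / (tn + fn),            # negative predictive value
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


metrics = diagnostic_metrics(TP, TN, FN, FP)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
# accuracy: 65.6%
# ppv: 28.9%
# npv: 95.1%
# sensitivity: 82.8%
# specificity: 62.4%
```

These values agree with the reported figures to within rounding; the sensitivity computes to 24/29, or about 82.8%, against the reported 82%.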
Figure 2 illustrates the variation of the sum of squares obtained from the analysis of variance. It shows a peak around a GNII of 1.6, beyond which yields shift from significantly high to considerably low. This peak represents the optimal point for distinguishing a highly productive population from a less productive one. The validity of the cutoff value is supported by its proximity to the theoretical GNII value.
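The search for the peak in the sum of squares can be sketched as follows. This is a minimal illustration of the classical Cate-Nelson idea, in which every candidate split of the GNII axis is scored by the between-group sum of squares of yield; it is not the authors' code, and the data are hypothetical values chosen only to show the mechanics.

```python
def cate_nelson_critical_x(x, y):
    """Return (critical_x, max_between_group_ss) for paired observations.

    Each candidate split lies midway between two consecutive sorted x values.
    The split that maximizes the between-group sum of squares of y is kept.
    """
    pairs = sorted(zip(x, y))
    n = len(pairs)
    ys = [p[1] for p in pairs]
    total = sum(ys)
    correction = total * total / n
    best_x, best_ss = None, float("-inf")
    left_sum = 0.0
    for i in range(1, n):  # i = number of points in the left group
        left_sum += ys[i - 1]
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between tied x values
        right_sum = total - left_sum
        ss = left_sum ** 2 / i + right_sum ** 2 / (n - i) - correction
        if ss > best_ss:
            best_x = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_ss = ss
    return best_x, best_ss


# Hypothetical data: GNII values and yields in t ha^-1 (not from the study).
gnii = [0.8, 1.1, 1.3, 1.5, 1.7, 2.0, 2.6, 3.4, 5.0, 8.0]
yields = [12.5, 11.8, 12.1, 11.4, 8.9, 9.3, 7.8, 8.1, 6.5, 6.9]

critical_gnii, peak_ss = cate_nelson_critical_x(gnii, yields)
print(round(critical_gnii, 2), round(peak_ss, 2))
```

For this toy dataset the sum of squares peaks at the split between GNII 1.5 and 1.7, which separates the four high-yielding points from the six low-yielding ones.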
The 1.6 and 2.2 values enabled the classification of the specimens into three groups: a highly balanced composition where GNII ≤ 1.6, a balanced range where 1.6 < GNII ≤ 2.2, and an imbalanced category where GNII > 2.2.
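The three-group rule can be written as a small function. This is only a sketch of the thresholds stated above; the constant and function names are illustrative.

```python
HIGHLY_BALANCED_MAX = 1.6  # Cate-Nelson critical GNII value from the text
BALANCED_MAX = 2.2         # theoretical GNII value from the text


def gnii_class(gnii: float) -> str:
    """Assign a specimen to one of the three nutrient balance groups."""
    if gnii <= HIGHLY_BALANCED_MAX:
        return "highly balanced"
    if gnii <= BALANCED_MAX:
        return "balanced"
    return "imbalanced"


for value in (1.2, 1.6, 2.0, 2.2, 3.5):
    print(value, gnii_class(value))
```

Both boundaries are inclusive on the lower group, so a GNII of exactly 1.6 is highly balanced and a GNII of exactly 2.2 is balanced.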
These findings suggest that the variability observed in the four Cate-Nelson quadrants is linked to the specimens under examination, with variations attributable to environmental factors and to the cultivar as a genetic factor. Soil pH, organic matter, and precipitation play important roles in plant growth and development and have a marked effect on specimen characteristics, as confirmed by the feature importance results (Section 3.2.1.1). Genetic variation among the cultivars was also identified as a significant contributor to the observed variability. Understanding the mechanisms that determine variability in plant attributes is critical for understanding the ecological processes involved in adaptation to environmental change and for guiding the selection of the cultivars best adapted to local conditions. Hence, each factor was studied in order, from the most predictive variable for GNII to the least predictive one.
3.2.2. Environmental Variables Affecting the Determination and Forecasting of the GNII Value
To predict GNII as a function of soil pH, SOM, cultivar, rainfall, and country, XGBoost and Random Forest (RF) gave comparable results. Compared with RF (robustness (R²) = 60%, root mean squared error (RMSE) = 4.730, intercept = 8.70, and accuracy (slope) = 60%), the XGBoost model (R² = 65%, RMSE = 4.450, intercept = 5.94, and slope = 70%) was selected for its better performance: its R² and slope are closer to 100%, and its intercept and RMSE are closer to zero.
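The four comparison criteria can be computed from paired observed and predicted GNII values. The sketch below assumes that the slope and intercept come from a simple linear regression of predicted on observed values and that R² is the squared correlation between the two; the text does not state these definitions, and the data are hypothetical.

```python
import math


def evaluate(observed, predicted):
    """Return R2, RMSE, slope, and intercept for predicted versus observed."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sxx = sum((o - mean_o) ** 2 for o in observed)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    sxy = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    slope = sxy / sxx                    # regression of predicted on observed
    intercept = mean_p - slope * mean_o
    r2 = sxy ** 2 / (sxx * syy)          # squared Pearson correlation
    rmse = math.sqrt(
        sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n
    )
    return {"r2": r2, "rmse": rmse, "slope": slope, "intercept": intercept}


# Hypothetical observed and predicted GNII values (not from the study).
observed_gnii = [1.0, 2.0, 3.0, 4.0]
predicted_gnii = [1.5, 2.0, 2.5, 3.0]
print(evaluate(observed_gnii, predicted_gnii))
```

A perfect model would give R² = 1, slope = 1, intercept = 0, and RMSE = 0, which is the sense in which the selected model is the one closer to those targets.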
Figure 3b shows the importance of each variable in the GNII prediction. Ranking the variables by their scores shows that soil pH was the most influential predictor of GNII, with a dependence score of 40.36%, whereas country had the smallest influence, with a dependence score of 9.30%.
Soil pH plays a critical role in predicting GNII values, which vary substantially within the pH range of 5.5 to 8.2 (Figure 3c). At a pH of 5.5, the GNII value is approximately 10, indicating a highly imbalanced nutritional profile. Figure 3c also shows a distinct decline in GNII within the pH range of 6 to 6.5, where values remain consistently low. In this range, the population can be classified as nutritionally balanced, with GNII values around 2.2. In contrast, at pH values above 6.8, GNII falls within a region marked by minimal yield and a notably unbalanced nutritional profile. This indicates that pronounced imbalances prevail under extreme conditions, specifically in highly acidic environments with a pH of around 5.5 or below and in alkaline conditions where the pH exceeds 6.8.
Variability in soil organic matter (SOM) also influences GNII values (Figure 3d). Based on frequency, the results center on three SOM values: 9.67, 13.55, and 25 g kg⁻¹. The first peak, observed at 9.76 g kg⁻¹, corresponded to an imbalance index of approximately 12. This indicates a population with a highly imbalanced nutritional profile and suggests that SOM at this level is too low to support nutritional balance. A slight increase in SOM to around 13.55 g kg⁻¹ lowered GNII to below 2.5, which falls within the zone representing a nutritionally balanced population. Beyond this point, a second peak emerged at a SOM concentration of 25 g kg⁻¹, with a GNII value exceeding 15, which indicates a significant nutritional imbalance.
According to the feature importance results, cultivar also significantly affects both production variability and GNII (Figure 3e). Grouping cultivars into the highly balanced, balanced, and imbalanced zones provides insight into their yield responses and GNII and into their agricultural characteristics and performance. In the high-potential Class 1, cultivars such as Suwan 1, Zhengdan 958, and 3312ET stand out for their low GNII values, below 1.6, which indicate a very balanced nutritional profile (Figure 3e). Such a nutritional signature is valuable for selecting the most suitable cultivars for specific agrosystems in order to improve yields. Class 2 comprises cultivars with a balanced nutrient composition, including Dekalb DK221 and the Pioneer brand cultivars cv. 3223 and 31G98 (Figure 3e). The third category, the imbalanced zone, is characterized by a significantly unbalanced nutritional profile and includes cultivars such as cv. Nakhon Sawan 3, 3624, 4550, BARI Hybrid Bhutta-9, and DK888 (Figure 3e). Recognizing these cultivars is important for understanding and managing their effect on agricultural outcomes, and the classification as a whole supports informed cultivar selection.
A further element in predicting GNII values is shown in Figure 3f, which presents the variation of GNII values across countries. Bangladesh and Thailand display the highest GNII values, with scores of 12 and 11, respectively. This suggests that nutritionally imbalanced corn predominates in these countries, in contrast to countries such as the USA, Canada, and China. The nutritional balance of corn can differ markedly from country to country because of factors such as climate, soil quality, farming practices, and genetic diversity. These differences need to be considered in any analysis of GNII values on a global scale.
Rainfall is the main climatic factor influencing GNII values (Figure 3g), and rainfall variability plays a critical role in GNII fluctuations. When rainfall is within the range of 750 to 1100 mm, GNII values tend to be low, not exceeding 2, which indicates an excellent nutritional balance in the crops. Conversely, higher rainfall, specifically between 1200 and 1500 mm, is associated with two pronounced peaks in GNII values of approximately 11.28 and 13. These peaks indicate a high nutritional imbalance and point to an association between increased rainfall and nutrient imbalance.