We present a modeling framework for quantifying the sensitivity of maize yield predictions to uncertainties in climate. The framework couples a GxE model, which integrates the co-variability of environmental covariates and maize genetic molecular markers, with the PAWN global sensitivity analysis (GSA). The GSA-GxE modeling framework supports the thesis that integrating genetics, climate, and their interactions helps identify the climate variables responsible for improving the predictability of maize yields in the U.S. and Canada (US-CAN). We consider that the effects of climate on maize predictability can shed light on how crops respond and adapt to spatiotemporal fluctuations in climate, and on our ability to capture such patterns of variability and crop responses in collected data and in biophysical, statistical, and data models [8, 11, 12, 17, 80, 81]. The selection of rainfall, solar radiation, temperature, and relative humidity to create the covariance matrices for the GSA's conditional phase followed studies that indicate their influence on maize growth and production [24, 26, 70, 82].
Table 1 lists the ranges of observed values used in the unconditional phase, which represent the climate variations that occurred between 2014 and 2017. These ranges were used to generate the 100 nominal values for the selected variables (i.e., Tmin, Tmean, Tmax, SRmin, SRmean, SRmax, Racc, RHmin, RHmean, and RHmax) in the conditional phase. The GSA-GxE framework can also be expanded to include other climate or environmental variables released in [19] and [74].
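The generation of nominal values from the observed ranges can be sketched as follows. This is a minimal illustration, not the authors' code: the text does not state how the 100 values were drawn, so uniform sampling is an assumption, and the ranges, the dictionary `observed_ranges`, and the function `nominal_values` are hypothetical placeholders rather than Table 1 entries.

```python
import numpy as np

# Hypothetical (min, max) ranges; placeholders, NOT the values in Table 1.
observed_ranges = {
    "Tmean": (10.0, 30.0),
    "Racc": (0.0, 900.0),
}

def nominal_values(low, high, n=100, seed=0):
    """Draw n nominal values inside [low, high).

    Uniform sampling is an assumption made for this sketch.
    """
    rng = np.random.default_rng(seed)
    return rng.uniform(low, high, size=n)

# One array of 100 nominal values per conditional variable.
nominals = {
    name: nominal_values(low, high)
    for name, (low, high) in observed_ranges.items()
}
```

Each array in `nominals` would then feed one conditional variable, with one GSA-GxE iteration per nominal value.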
3.1. The Environmental Covariance Matrix
In the unconditional phase of GSA-GxE, all 15 variables are set to the observed time series at each G2F experiment, and the unconditional environmental covariance matrix is calculated. The conditional phase of the coupled GSA-GxE framework is iterated over each of the 100 generated nominal values, and in each iteration the conditional environmental covariance matrix is computed. The covariance values quantify environmental similarity as the co-variability between the time series of pairs of G2F experiments. In other words, the covariance function measures the joint variability of the G2F experiments' hydroclimatic time series by synthesizing the co-variability of the 15 climatic variables.
Figure 4 shows the histograms of covariance values in the unconditional phase (in gray) and in the conditional phase for each conditional variable: temperature, solar radiation, rainfall, and relative humidity.
The unconditional environmental covariance is the same regardless of which variable is chosen as the conditional variable, since it is calculated from the observed time series of all 15 hydroclimatic variables across the G2F study area. The conditional environmental covariance for a given conditional variable is calculated in each iteration with the generated nominal value. Because the selected nominal value is held constant across all G2F experiments within an iteration, it does not change the joint variability of the time series between G2F experiments. Consequently, the conditional environmental covariance calculated in iterations 1 through 100 remains the same, since the covariance function measures how the time series of each pair of G2F experiments covary. In Figure 4, the probabilities of the unconditional and conditional covariance values are slightly different. These slight differences align with our previous study [72], where we coupled the GSA with the environmental covariance. In that study, we found PAWN sensitivity indices of the environmental covariance to T, SR, R, and RH of 0.091, 0.084, 0.077, and 0.082, respectively. In the next step of the GSA-GxE, the slight contrasts observed between the unconditional and conditional environmental covariances interact with the genetic covariance through the GxE model. The model propagates these contrasts through the product of the environmental and genetic covariances, which allows the hydroclimatic variables to be ranked from the most to the least impactful on maize yield predictability.
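The behavior described above can be illustrated with a small sketch. It assumes, as one common construction and not necessarily the one used in this study, that the environmental covariance is the cross-product of column-standardized covariates divided by the number of covariates. The names `env_covariance` and `conditional_covariance` and the random data are hypothetical.

```python
import numpy as np

def env_covariance(W):
    """Environmental covariance from an (environments x covariates) matrix.

    Assumed form: Z Z^T / p, with Z the column-standardized covariates.
    Columns with no variation contribute zero after centering.
    """
    W = np.asarray(W, dtype=float)
    centered = W - W.mean(axis=0)
    std = W.std(axis=0)
    safe_std = np.where(std > 1e-12, std, 1.0)  # avoid dividing by zero
    Z = centered / safe_std
    return Z @ Z.T / W.shape[1]

def conditional_covariance(W, column, nominal):
    """Fix one covariate to the same nominal value in every environment."""
    W_cond = np.array(W, dtype=float, copy=True)
    W_cond[:, column] = nominal
    return env_covariance(W_cond)

rng = np.random.default_rng(1)
W_obs = rng.normal(size=(6, 15))  # 6 hypothetical environments, 15 covariates

omega_uncond = env_covariance(W_obs)
omega_a = conditional_covariance(W_obs, column=0, nominal=12.5)
omega_b = conditional_covariance(W_obs, column=0, nominal=31.0)
```

Under these assumptions, `omega_a` and `omega_b` coincide because a covariate held constant across environments carries no joint variability, while both differ from `omega_uncond`, which still includes the observed variation of that covariate.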
Phenotypes such as grain yield are affected by genetics, environmental drivers, and the complex interactions between them [9], meaning that environmental similarities do not linearly affect the yields, the values predicted by GxE models, or the resulting errors. In the study by [83], the GxE interaction was the most important factor for maize yield predictability in the G2F layout compared with the independent components. This complexity introduces potential error propagation and increases the sensitivity of maize yield predictability to the GxE relative to the sensitivity of the environmental covariance to hydroclimatic variables.
Another complexity in maize phenotype predictability is that the tested maize varieties differ across the designed G2F experiments [18]. This genetic variability among the trials is considered in the GxE model through the genetic covariance (G). Similarly to the environmental covariance, which quantifies the similarity among the environments based on hydroclimatic time series, the calculated G measures the similarity among the maize varieties using molecular markers [83]. These variations in the molecular genetic markers lead to different phenotypic responses to climate conditions. For example, [84] showed that maize species with different tolerance thresholds respond differently to temperature means and extremes. Thus, the effect of hydroclimatic variables and their interaction with genetic markers, expressed through the environmental covariance and the genetic covariance (G), on maize yield predictability can be estimated.
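How the genetic and environmental covariances can be combined into an interaction term is sketched below. The sketch assumes that the product is taken element-wise (Hadamard) after expanding both matrices to the observation level, a form commonly used in reaction-norm models. The function `interaction_kernel` and all numeric values are hypothetical.

```python
import numpy as np

def interaction_kernel(G, omega, line_idx, env_idx):
    """Observation-level GxE covariance as an element-wise product.

    G        : (lines x lines) genetic covariance
    omega    : (environments x environments) environmental covariance
    line_idx : line index of each observation
    env_idx  : environment index of each observation
    """
    K_g = G[np.ix_(line_idx, line_idx)]      # expand G to observations
    K_e = omega[np.ix_(env_idx, env_idx)]    # expand omega to observations
    return K_g * K_e                         # Hadamard (element-wise) product

# Hypothetical covariances for two lines and two environments.
G = np.array([[1.0, 0.5],
              [0.5, 1.0]])
omega = np.array([[1.0, 0.2],
                  [0.2, 1.0]])

# Three observations: (line 0, env 0), (line 1, env 0), (line 0, env 1).
K_gxe = interaction_kernel(G, omega, line_idx=[0, 1, 0], env_idx=[0, 0, 1])
```

With this construction, two observations covary strongly in the interaction term only when both their genotypes and their environments are similar, which is how small changes in the environmental covariance can propagate to predicted yields.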
As mentioned above, the study of [72] contrasted the conditional and unconditional environmental covariances to calculate the sensitivity of covariance values to fluctuations in the hydroclimate in one iteration by coupling the GSA with the environmental covariance. In the present study, we introduced the GSA-GxE coupling and extended the number of iterations for verification purposes. To test the GSA-GxE framework, we used a four-year dataset with a limited number of trials, mainly over the eastern and central US. With this testing procedure, we could miss the effects of long-term modes of climate variability and their co-variability with genetics. This data limitation can be addressed by releasing and using new hydroclimatic data for a larger number of G2F environments across time and space scales, which may enhance the model predictability [17]. Nevertheless, the proposed GSA-GxE methodology can be expanded to other locations and tested with datasets other than G2F. We also recommend using released environmental and omics datasets from other crop breeding programs, such as the International Maize and Wheat Improvement Center (CIMMYT), to test and enhance the proposed sensitivity analysis framework.
3.2. The GSA-GxE Framework
Sensitivity analyses have been explored from multiple perspectives [57, 85], including those aimed at identifying the main drivers of environmental change using physical and data-driven models [58, 59, 60]. In crop phenotyping diagnostics and prognostics, such efforts have centered on the use of crop and Earth system models and statistical analyses of climate and crop yields [22, 26, 70, 80, 86, 87, 88, 89, 90]. The authors of [72] introduced a PAWN GSA coupler for the environmental covariance using the G2F initiative data, which is the foundation for the GSA-GxE coupler presented here. The GSA-GxE coupler estimates the sensitivity of the GxE model performance to the constructed environmental covariance and accounts for the possible variations in climate as drivers of maize yield predictability.
The sensitivity of the GxE model performance to the constructed conditional environmental covariance matrix has been assessed successfully for T, SR, R, and RH, which supports the central thesis of quantifying the GxE performance sensitivity to test the hydroclimatic drivers of maize yield predictability.
Figure 5 illustrates the unconditional and conditional CDFs of the GxE model performance for T, SR, R, and RH. The 100 iterations for each conditional variable take approximately one month on a Windows system with an Intel Core i9 processor. The publicly available code allows users to run this methodology for as many iterations as they wish. The iterations tested in this study showed that the differences between the SI values for all variables were minimal, indicating that this number of iterations could be sufficient to reach the maximum SI value. The SI value is the maximum difference between the unconditional and conditional CDFs (the K-S statistic) among all iterations. After completing all 100 iterations, the maximum K-S statistic is reported as the PAWN sensitivity index (Equation (16)) of the GxE model performance (R² in Equation (17)) for a given conditional variable.
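The index described above, the largest K-S distance between the unconditional CDF and the conditional CDFs over all iterations, can be sketched as follows. This is a minimal illustration with empirical CDFs. The functions `ks_statistic` and `pawn_index` and the sample performance values are hypothetical and are not taken from the study's code or results.

```python
import numpy as np

def ks_statistic(sample_a, sample_b):
    """Maximum absolute difference between two empirical CDFs."""
    a = np.sort(np.asarray(sample_a, dtype=float))
    b = np.sort(np.asarray(sample_b, dtype=float))
    grid = np.concatenate([a, b])  # evaluate both CDFs at every data point
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

def pawn_index(unconditional, conditionals):
    """Largest K-S statistic across all conditional samples (iterations)."""
    return max(ks_statistic(unconditional, c) for c in conditionals)

# Hypothetical model-performance samples (e.g., R² values).
r2_uncond = [0.1, 0.2, 0.3, 0.4]
r2_cond_by_iteration = [
    [0.1, 0.2, 0.3, 0.4],  # identical distribution, K-S = 0
    [0.3, 0.4, 0.5, 0.6],  # shifted distribution, K-S = 0.5
]

si = pawn_index(r2_uncond, r2_cond_by_iteration)
```

Taking the maximum over iterations follows the description in the text. Other summary statistics (for example, the median) are also used with PAWN, so the choice of `max` should be read as mirroring this study rather than as the only option.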
The largest PAWN sensitivity index for the study area corresponds to solar radiation (SI_SR = 0.25). Temperature is the next most influential climatic driver of GxE model performance (SI_T = 0.18). The sensitivity indices calculated for rainfall and relative humidity are the same (SI_R = SI_RH = 0.17). The dominance of solar radiation is supported by biophysical crop modeling and observations. For instance, [93] suggested that the effects of solar radiation on maize yields are often overlooked compared to other climatic factors. Their study shows that 27% of the maize production growth in the U.S. can be attributed to increasing solar radiation. The authors of [94] also found that the effects of solar radiation on maize yields surpassed those of temperature and rainfall. Yet other patterns emerge when solar radiation is compounded with an increasing variability of precipitation, leading to less conspicuous simulated changes in yields. On the other hand, using observations and physiological attributions between climate and crop development, [26] shed some light on how photosynthesis and solar radiation drive crop development in the conterminous US. Thus, the aggregate SI values in Figure 5 indicate how the GSA-GxE coupler and the contrasting SR, R, RH, and T, as compound and individual feasibility spaces, evidence the contributions of climate factors to maize yield predictions in the U.S. and Canada.
The effects of markers and environmental covariates, modeled with the covariance structures introduced by [9] and coupled to the GSA by [61], illustrate the dominance of different climate variables on maize yield predictability at each location.
Figure 6 shows the spatial distribution of the most and second most influential climatic drivers of maize yield predictability and their associated GxE modeling performance (R²). The most sensitive climate drivers in Figure 6a indicate that R dominates in 26 sites, while RH, SR, and T are the main controls of maize predictability in 21, 20, and 17 sites, respectively. Figure 6b shows that RH, SR, T, and R are the second most influential drivers of maize predictability in 26, 24, 19, and 15 locations, respectively. Additionally, the most and second most influential predictors share a consistent sequence: RH, SR, and T. The authors of [80] indicated that crop sensitivity studies have been dominated by assessments of how temperature and, to a lesser extent, rainfall affect crop yields. Other studies have assessed the compounded effect of temperature and precipitation deficits in shortening the crop's growing season [24, 29, 80, 86, 87, 95, 96, 97]. While the compounded effect of temperature and precipitation on yields can be seen as a crop's adaptive mechanism when yields are sustained, long-growing maize varieties can be sensitive to water deficits or surpluses [61, 88, 98]. The sequences presented here indicate that the patterns of climate variability need to be further explored and explained. The authors of [10] and [32] provide a framework to model the complex interactions (i.e., climate, socioeconomic, and land use) driving agricultural land use in West Africa. The authors of [22] also highlighted the key roles of the interconnections between genomics and enviromics for crop phenotyping in a changing climate. However, it remains unclear how genetics and climate will interact and lead to secure agriculture in the short- and long-term future.
Another perspective on the compounding effect of climate or environmental variables on maize yields and the sequence RH, SR, and T in Figure 6 can be linked to the use of observations and of crop, Earth system, statistical, and data modeling [29, 80, 87, 92]. Figure 6 illustrates how the global sensitivity analysis and the construction of environmental covariates enable the conceptualization of compounding environmental variables and the identification of their individual contributions. GSA-GxE operates within a feasibility space that captures the complexity of plants' responses to spatiotemporal environmental variations. Such variations can also reflect our ability to capture or parameterize processes using high-dimensional ecosystems of digital resources (i.e., data, parameterizations, analytics, and conceptualizations). The authors of [99] used a crop model to assess how the effects of multiple factors on crop yields are sensitive to the spatial resolution of the inputs, the parameters in the implementation of the model, and, eventually, the results. While statistical approaches have explicitly integrated genetic-by-environment interactions into crop yield simulations, it remains unclear how the individual factors play a role across large-scale areas [9, 13, 17, 92]. The authors of [25] and [80] highlight the need to characterize the individual contributions of climate factors to crop yield predictions. The effort presented here addresses this point and explores the relative contribution of four climate variables, which scales up what [91] and [17] showed. Some of those changes have not been characterized in terms of the individual contributions of multiple climate factors [23]; the present work continues the activities launched by the G2F Initiative, including the studies of [16, 83, 91, 100]. Furthermore, the resulting crop yield sensitivities to climate factors and their distribution across US-CAN suggest the need to identify the geospatial and temporal patterns of variability in the genetic-by-climate interactions. Such patterns and additional sources of predictability could emerge from monitoring technologies that combine unmanned aerial vehicles and eddy covariance towers [101, 102], co-segmentation methods that enhance current computer vision-based phenotyping [103, 104], remote sensing-based modeling for diagnostics and predictions of biophysical variables [105], and technologies to improve best management practices. These advances can help show how predicted weather and climate conditions can aid hybrid selection, support cultivar management during the growing season, and prevent or mitigate major impacts of extreme hydrometeorological and climate events.