4.2.1. Optimal Parameters
Due to the significant spatial heterogeneity exhibited by most optimal parameters in PSO and SCE, this study conducted a statistical analysis of spatially homogeneous (or heterogeneous) parameters based on parameter type classifications (see
Figure S1-1, briefed in
Table 2). For the "Vegetation" type, the number of homogeneous parameters is zero in every scenario except the SCE scenario with CCS, indicating heterogeneity. Regarding the "Soil" type, the counts (
Hp,
Hs) of homogeneous parameters for the PSO and SCE calibration schemes based on the EKGE, EMO, MAES, and RMSES metrics are (1, 3), (2, 2), (1, 1), and (2, 1), respectively. For the "General" type, the (l1, l2) based on CCS, EKGE, EMO, MAES, and RMSES metrics are (1, A), (2, 2), (2, 2), (1, 1), and (1, 1), respectively. For the "Initial" type, the (l1, l2) based on EKGE, EMO, and MAES metrics are (4, 2), (3, 2), and (2, 1), respectively. Evidently, the "Vegetation" type shows the least spatial homogeneity of optimal parameters in both PSO and SCE, suggesting the strongest heterogeneity. Conversely, the "Soil" and "General" types exhibit the least spatial heterogeneity, while the "Initial" type falls in the middle. Notably, the QTZ and SBETA parameters are consistently homogeneous, falling below the parameter space threshold (0.03), across the PSO and SCE schemes based on the EKGE, EMO, MAES, and RMSES metrics.
Regarding the counts of homogeneous parameters in the PSO and SCE schemes, the differences among metrics are as follows: for CCS, the counts are 1 and 40, respectively; for EKGE, both schemes yield 7; for EMO, the counts are 7 and 6; for MAES, 4 and 3; for NSES, 2 and none; both PKGE and PMO register none; and for RMSES, the counts are 5 and 2. Evidently, the homogeneity or heterogeneity of parameters varies substantially among calibration schemes based on different metrics. Notably, CCS exhibits the lowest parameter heterogeneity, followed by EKGE, then EMO, and subsequently MAES and RMSES. NSES displays relatively strong parameter heterogeneity, whereas PKGE and PMO show the highest degree of parameter heterogeneity.
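The counting of homogeneous parameters described above can be illustrated with a minimal sketch. Only the 0.03 threshold comes from the text; the spread measure (range across sites divided by the allowed parameter range), the function names, the parameter bounds, and the site values are all assumptions made for illustration and may differ from the definition used in the study.

```python
def normalized_spread(values, lower, upper):
    """Range of the optimal values across sites, scaled by the allowed range.

    Assumption: the study's homogeneity measure may be defined differently.
    """
    return (max(values) - min(values)) / (upper - lower)


def count_homogeneous(optima, bounds, threshold=0.03):
    """Return the number and names of parameters whose spread is below the threshold."""
    names = [
        name
        for name, values in optima.items()
        if normalized_spread(values, *bounds[name]) < threshold
    ]
    return len(names), names


# Hypothetical optimal values at four sites and hypothetical parameter bounds.
optima = {
    "SBETA": [-2.00, -1.99, -2.01, -2.00],
    "CZIL": [0.10, 0.45, 0.80, 0.30],
}
bounds = {"SBETA": (-4.0, -1.0), "CZIL": (0.05, 0.8)}

count, names = count_homogeneous(optima, bounds)
print(count, names)
```

With these illustrative numbers, only SBETA falls below the threshold, so it alone would be counted as homogeneous.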
In addition, the interquartile ranges (IQRs) of individual parameters and of the entire parameter space for PSO and SCE, as contributed by the different metrics, are shown in
Figure 5. For PSO, the SNP parameter of the “Vegetation” type has the maximum IQR (about 4.8) and hence the largest uncertainty, with EMO making the largest contribution. In contrast, the SBETA parameter of the “General” type has the minimum IQR (about 1.2), with EKGE, EMO and RMSES making the smallest contributions (
Figure 5a). For SCE, the CZIL parameter of the “General” type has the maximum IQR (around 1.82) and hence the largest uncertainty, with EMO again making the largest contribution, whereas the CSOIL parameter of the “General” type has the minimum IQR (around 0.61), with EKGE, EMO and RMSES making the smallest contributions (
Figure 5b). In general, PSO yields higher IQRs than SCE for most metrics, and the lowest parameter uncertainties are found for SBETA under PSO and CSOIL under SCE, both in the “General” type.
The interquartile range (IQR) distribution of the global optimal parameter space for the PSO schemes across various metrics is broader and more scattered than that of SCE (
Figure 5c). For PSO, the median sizes of the IQR distributions (IQRD) within the global optimal parameter space, ranked from highest to lowest, are PKGE > PMO > NSES > CCS > EMO > EKGE > RMSES > MAES. In contrast, for SCE, the order is PMO > PKGE > EKGE > EMO > MAES > RMSES > CCS. Furthermore, for PSO, the number of outliers in the IQRD is highest for EKGE with 3, followed by EMO and CCS with 2, while the rest of the metrics have 0 outliers. For SCE, EKGE and RMSES share the highest number of outliers at 2, followed by EKGE and MAES with 1 outlier each, and the rest are 0 (
Figure 5d). In summary, significant differences exist in the IQRD of the global optimal parameter spaces across different metrics, with SCE exhibiting smaller IQRD but relatively more outliers. Notably, EKGE and EMO exhibit relatively large numbers of outliers in both PSO and SCE.
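For readers who wish to reproduce this kind of summary, the following is a minimal sketch of how an IQR and its box-plot outliers can be computed. It assumes that outliers are identified with the common 1.5 × IQR (Tukey) rule, which the text does not state; the sample values are hypothetical.

```python
import statistics


def quartiles(values):
    """First and third quartiles using the inclusive (linear interpolation) method."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q3


def iqr(values):
    """Interquartile range, used here as a simple measure of parameter uncertainty."""
    q1, q3 = quartiles(values)
    return q3 - q1


def tukey_outliers(values):
    """Values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] (assumed outlier rule)."""
    q1, q3 = quartiles(values)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


# Hypothetical per-parameter IQRs for one metric (illustrative numbers only).
sample = [1, 2, 3, 4, 5, 6, 7, 8, 30]
print(iqr(sample))             # 4.0
print(tukey_outliers(sample))  # [30]
```

Different quartile conventions give slightly different IQRs for small samples, so outlier counts near the fences can depend on the method chosen.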
Furthermore, the spatial IQRs of the PSO parameters are compared with those of SCE for each parameter type in
Table 3, using the number of parameters with lower uncertainty (
PNL) and the reduction in the number of outliers in parameter uncertainty (
ONR) in PSO when compared to SCE. For the "Vegetation" type, all metrics are null except for the
PNL value of EKGE, which is 2, while the
ONR of all metrics is non-positive. For the "Soil" type, the
PNL values are positive for all metrics except for CCS, NSES, and PKGE, which are null. The
ONR values are positive for EKGE, EMO, PMO, and RMSES, while negative for the rest. Regarding the "General" type, all metrics exhibit positive
PNL values except for NSES, PKGE, and PMO, whose
PNL values are null. The
ONR values are positive for EKGE, EMO, MAES, and RMSES, and negative for the others. In the "Initial" type, only EKGE and EMO have positive
PNL values, with the rest being null. The
ONR values are positive for all metrics except for CCS, PKGE, and PMO, which are non-positive. In summary, summing the
PNL values across types, EKGE has the highest total (8), followed by EMO and RMSES (7), then MAES (5), with PMO and CCS having the lowest totals (1). PKGE has no
PNL value. For the
ONR values, EMO has the highest total (9), followed by EKGE (3), then RMSES (3), while PMO has the lowest (2). The rest of the metrics have negative
ONR values.
In summary, for the SM-ST calibration with the same metric, SCE consistently achieves lower parameter uncertainty than PSO, albeit at the cost of relatively higher spatial heterogeneity. Specifically, in terms of parameter uncertainty, the smallest values are obtained with MAES in PSO and with CCS in SCE. As for parameter spatial heterogeneity, the smallest values are obtained with EKGE and EMO in PSO, and with EKGE alone in SCE.
4.2.2. Effectiveness and Efficiency
Figure 6 shows the calibration fitness curves (i.e., the best position of one population, Pb) for the different metrics, together with the median convergence position and the median number of Noah runs at convergence for PSO (
) and SCE (
) for all sites. For CCS, both PSO and SCE increase sharply before 3,000 Noah runs and converge to 1, at around 79,475 and 66,663 runs, respectively. For EKGE, both increase sharply before 10,000 Noah runs but converge to 0.56 at 99,017 runs and 0.53 at 90,731 runs, respectively. For EMO, both decrease to 1 before 8,000 Noah runs but converge to 1 at 99,297 runs and 1.08 at 82,709 runs, respectively. For MAES, both quickly decrease to the range of 0.7-1.1 before 10,000 Noah runs but converge to 0.79 at 99,765 runs and 0.81 at 94,795 runs, respectively. For NSES, both reach 1 almost instantly, at 187 runs, indicating the most rapid convergence among all metrics. However, for PKGE and PMO, since volatile fitness values (e.g., varying within
and
respectively) are found for all sites in each generation, nonstrict solutions can be observed. For RMSES, PSO and SCE both sharply decrease to 1 before 5,000 Noah runs but converge to 0.97 at 99,391 runs and 0.98 at 94,029 runs respectively.
Generally, except for PKGE and PMO, the metrics under PSO achieve better effectiveness, as indicated by their better fitness values, but relatively worse efficiency, as indicated by their larger numbers of runs to convergence compared with SCE. The lack of a solution for PKGE and PMO under both PSO and SCE indicates either that more Noah runs are required to achieve convergence or that the Pareto-dominated logic (i.e., that a surface improvement likely improves the subsurface) may fail. For MAES, NSES, and RMSES, the fitness curve of site C4 deviates notably from (or is worse than) those of the other sites. Nevertheless, among all metrics, MAES has the largest range of converged values, which could indicate a divergent convergence domain.
Figure 7 presents the success rate curves for calibration across various metrics. For CCS, PSO experiences a decline from 70% to 20% during the first 10,000 Noah runs, followed by a gradual decrease to near zero. In the case of EKGE, PSO initially shows a decline from 80% within the first 5,000 Noah runs, subsequently exhibiting two distinct patterns: fluctuations around 40% and 20%, respectively. For EMO, PSO drops from 80% to nearly 0% within the initial 25,000 Noah runs, with some stations subsequently exhibiting strong fluctuations between 0% and 80%. MAES follows a similar trend, with PSO declining from 80% to near 0% within the first 15,000 Noah runs, and subsequent intense fluctuations between 0% and 80% at certain stations. For NSES, PSO gradually decreases from 80% to 20% within the first 35,000 Noah runs and remains stable thereafter. PKGE and PMO exhibit similar behavior, with PSO slowly declining from 80% to 20% within the first 20,000 Noah runs and fluctuating slightly around 20% thereafter. SCE's performance in PKGE resembles that of CCS. In contrast, RMSES displays a fluctuating decline from 80% to 0% within the initial 20,000 Noah runs for PSO, followed by drastic fluctuations between 20% and 80%. However, SCE consistently demonstrates a rapid initial decrease from 80% to 20% across nearly all metrics, maintaining this level thereafter.
For all metrics, the search domain of SCE exhibits a consistent pattern, characterized by an L-shaped thin linear region. In contrast, PSO's search domain displays significant fluctuations and notable variations across different metrics (e.g., EKGE, EMO, MAES, RMSES), albeit with an overall larger area than SCE. This suggests that for most metrics, PSO demonstrates stronger evolutionary capabilities compared to SCE, which primarily contributes to PSO's slightly slower convergence rate compared to SCE.
Figure 8 presents the statistical performance of the optimal objectives across all stations for various metrics. For CCS, both PSO and SCE exhibit a concentrated distribution near 1, with PSO displaying a tighter clustering and an outlier at 0.973. In the case of EKGE, PSO and SCE concentrate around 0.58 and 0.53, respectively, with PSO showing a more focused distribution and an outlier at 0.34. For EMO, PSO and SCE are centered near 1 and 1.1, respectively, with PSO displaying a relatively dispersed distribution and an outlier at 1.5. MAES values for PSO and SCE are centered around 0.79 and 0.81, respectively, demonstrating similar distributions. For NSES, PKGE, and PMO, both PSO and SCE have concentrated distributions near 1, with NSES exhibiting a more tightly clustered distribution compared to the other two metrics. Finally, for RMSES, PSO and SCE are centered around 0.9 and 1.1, respectively, with SCE displaying a more focused distribution, and both having outliers at around 2.4.
It is evident that for the optimal solutions of PKGE and PMO, both PSO and SCE yield values of 1, indicating the absence of optimal solutions or the need for more time to locate them. In contrast, numerical optimal solutions were achieved for other metrics. Furthermore, while PSO consistently outperformed SCE in attaining better optimal solutions across almost all metrics, significant variations were observed in the enrichment levels of optimal solutions between PSO and SCE under different metrics. For instance, PSO surpassed SCE in CCS and EKGE, whereas SCE surpassed PSO in EMO, MAES, and RMSES. Notably, PSO and SCE exhibited similar performance in NSES. This underscores the disparate spatial variability characteristics of optimal solutions influenced by distinct metrics (whereby the enrichment levels of optimal solutions at different sites reflect the extent of spatial variability). Additionally, notable outliers were identified in PSO's performance within CCS, EKGE, and EMO metrics, while both PSO and SCE exhibited outliers in the RMSES metric. This indicates that for RMSES, unquantifiable factors within the spatial variability of optimal solutions are more pronounced, whereas for other metrics, PSO's performance relative to SCE is more significantly influenced.
In summary, apart from PKGE and PMO, for other metrics, PSO typically exhibits better optimal solutions, i.e., enhanced effectiveness, compared to SCE, albeit at the cost of relatively lower efficiency. Notably, for CCS, EKGE, and RMSES, the optimal solutions obtained by PSO demonstrate higher kernel densities than those by SCE. Conversely, for EMO and MAES, the performance trend is reversed.
4.2.3. Optimal Simulation
Figure S2-1 presents linear fitting (
s,
r²) between simulations and observations of
and
under varying metrics. For
, PSO
s (in descending order) are EMO, EKGE, RMSES, MAES, PMO, NSES, CCS, PKGE, with
r² values also descending from EMO to PKGE. In contrast, the SCE slopes (
s) are EMO, PMO, EKGE, MAES, PKGE, NSES, RMSES, CCS, with
r² following a similar but slightly different descending order. For
's linear fitting (
Figure S2-1-2), PSO
s are EKGE, EMO, MAES, RMSES, CCS, NSES, PKGE, PMO, while
r² values show a distinct ordering: PMO, followed closely by EMO/PKGE, then MAES/RMSES/NSES, EKGE, and finally CCS. SCE fitting for
exhibits a different ordering for
s (EKGE, EMO, CCS, RMSES, MAES, NSES, PMO, PKGE) and
r² values (PKGE, PMO, NSES, EKGE, EMO, RMSES, with CCS and MAES closely grouped).
Generally, for
, both PSO and SCE exhibit negative
s values for the NSES, PKGE, and PMO metrics, while the rest are positive (
Table 4). This indicates that most linear relationships between calibrated simulations and observations are positively correlated, which aligns with the improvement objectives of this study. Specifically, for EMO and EKGE, the
s values of PSO (SCE) in the calibration of
and
are 0.96 (0.83) and 0.18 (0.23), respectively, showcasing the optimal calibration performance (
Figure 9). Furthermore, it is noteworthy that for
, the highest
r² value of 0.11 is comparable to the lowest
r² value observed in
(PKGE), implicitly suggesting a greater challenge in modeling
.
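The slope and r² values discussed above come from a linear fit between simulations and observations. A minimal sketch of such a fit by ordinary least squares is given below; it assumes that the simulation is regressed on the observation (the text does not specify the direction), and the sample values are hypothetical.

```python
def linear_fit(obs, sim):
    """Ordinary least-squares fit sim = slope * obs + intercept, with r squared.

    Assumption: simulations are regressed on observations.
    """
    n = len(obs)
    mean_x = sum(obs) / n
    mean_y = sum(sim) / n
    sxx = sum((x - mean_x) ** 2 for x in obs)
    syy = sum((y - mean_y) ** 2 for y in sim)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(obs, sim))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r_squared


# Hypothetical observation and simulation pairs (illustrative numbers only).
obs = [0.10, 0.20, 0.30, 0.40]
sim = [0.12, 0.18, 0.33, 0.37]
slope, intercept, r_squared = linear_fit(obs, sim)
print(round(slope, 3), round(r_squared, 3))
```

A slope near 1 together with a high r² indicates that the calibrated simulation tracks the observations closely, whereas a negative slope indicates an inverse relationship.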
The Gaussian fitting, i.e., center (frequency) as
c(
f) with units of
(1), of
for
in
Figure S2-2-1 reveals: CTR’s
are widely distributed, peaking at ~0.15 (f=297). For CCS, the PSO and SCE errors are widely spread around -0.04 and 0.11 (f≈350, 295). For EKGE, the PSO and SCE errors are narrowly centered at 0 (f≈1276, 608). For EMO, the PSO and SCE errors peak narrowly at 0 and 0.01 (f≈1178, 700). For MAES, the PSO and SCE errors are slightly wider, centered at 0.01 and 0.02 (f≈344, 416). For NSES, the PSO and SCE errors are wide, centered at 0.05 (f≈274, 230). For PKGE, the PSO and SCE errors are widely centered at 0.08 and 0.11 (f≈322, 325). For PMO, the PSO and SCE errors peak narrowly at 0.02 and 0.03 (f≈480, 444). For RMSES, the PSO and SCE errors are narrowly centered at -0.02 and 0 (f≈426, 296). Moreover, the Gaussian fitting, i.e., center (frequency) as c (f) with units of K (1), of OBS-SIM for
(
Figure S2-2-2) shows that the CTR errors have a wide bimodal distribution centered at ~7.1 and -3.8 (f≈192, 134). For CCS, the PSO and SCE errors are widely centered at ~2.3 and 1.1 (f≈216, 167). For EKGE, the PSO and SCE errors are widely centered at ~1.3 and 2.5 (f≈200, 203). For EMO, the PSO and SCE errors are centered at ~0.85 and 1.23 (f≈170, 207). For MAES, the PSO and SCE errors are centered at ~-0.06 and 0.88 (f≈200, 230). For NSES, the PSO and SCE errors are widely centered at ~5.86 and 5.03 (f≈169, 213). For PKGE, the PSO and SCE errors are widely centered at ~4.91 and 5.01 (f≈237, 152). For PMO, the PSO and SCE errors are widely centered at ~6.1 and 5.19 (f≈300, 224). For RMSES, the PSO and SCE errors are centered at ~0.16 and 1.29 (f≈200, 206).
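The center and peak frequency reported for each Gaussian fit can be estimated from a histogram of errors in several ways. The text does not state which fitting procedure was used, so the sketch below swaps in one simple option: a least-squares parabola fitted to the logarithm of the bin counts, since the logarithm of a Gaussian is a quadratic. The function names and the synthetic histogram are assumptions for illustration only.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def fit_gaussian(centers, counts):
    """Fit ln(count) = a + b*x + c*x^2 and return (center, peak frequency, sigma)."""
    pts = [(x, math.log(y)) for x, y in zip(centers, counts) if y > 0]
    s = [sum(x ** k for x, _ in pts) for k in range(5)]
    t = [sum((x ** k) * ly for x, ly in pts) for k in range(3)]
    normal = [[s[i + j] for j in range(3)] for i in range(3)]
    d = det3(normal)
    coeffs = []
    for col in range(3):  # Cramer's rule for the 3x3 normal equations
        m = [row[:] for row in normal]
        for r in range(3):
            m[r][col] = t[r]
        coeffs.append(det3(m) / d)
    a, b, c = coeffs
    if c >= 0:
        raise ValueError("histogram is not peak-shaped; a Gaussian does not fit")
    center = -b / (2 * c)
    sigma = math.sqrt(-1 / (2 * c))
    peak = math.exp(a - b * b / (4 * c))
    return center, peak, sigma


# Synthetic histogram generated from a known Gaussian (center 0.5, peak 100, sigma 0.2).
centers = [round(-0.5 + 0.1 * i, 1) for i in range(21)]
counts = [100 * math.exp(-((x - 0.5) ** 2) / (2 * 0.2 ** 2)) for x in centers]
center, peak, sigma = fit_gaussian(centers, counts)
print(round(center, 3), round(peak, 1), round(sigma, 3))
```

Because the logarithm gives low-count bins a large influence, a weighted or nonlinear least-squares fit would usually be preferable for noisy histograms; a bimodal histogram, such as the one described for CTR, needs a sum of two Gaussians rather than this single-peak form.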
In summary, for
of
, EKGE's performance in both PSO and SCE is closest to a normal distribution, whereas for that of
, MAES exhibits the closest resemblance to normality (
Figure 10), with EKGE performing relatively poorly (
Table 5). This underscores the significant influence of metric discrepancies on calibration simulation errors, contingent upon distinct calibration objectives. Furthermore, excessively wide peaks with low frequencies in unimodal distributions (e.g., CCS, NSES, PKGE, and PMO) indicate a dispersed fitting distribution, potentially necessitating multimodal fitting (e.g., with more than two peaks). Conversely, bimodal distributions characterized by narrower peaks may call for a single-peak fitting centered around the modes.
Figure 11a depicts temporal
(
) variations for
. CTR's
is generally largest, 0.15 (decreasing during July 5th and 10th rainfalls), with a slight upward trend. For CCS, PSO's
0.15 increases slightly, while SCE's
fluctuates around 0.07. EKGE and EMO show PSO(SCE) RMSEs of 0.01(0.03) and 0.01(0.02), respectively, both trending downward. MAES's PSO/SCE
0.04, both declining. NSES's
around 0.07, up trending. PKGE's PSO(SCE)
0.12(0.1), up trending. PMO's PSO
decreases from 0.1 to 0.05, SCE's 0.07, slightly down. RMSES's PSO(SCE)
0.04(0.05), both declining. Moreover,
Figure 11b illustrates the overall RMSES distribution for
. Median
ranking from highest to lowest for PSO: CCS (0.13) > PKGE (0.12) > NSES (0.08) > PMO (0.07) > RMSES (0.039) > MAES (0.038) > EMO (0.018) > EKGE (0.017); for SCE: PKGE (0.1) > CCS (0.09) > NSES (0.085) > PMO (0.065) > RMSES (0.056) > MAES (0.039) > EKGE (0.03) > EMO (0.02). Notably, EKGE and EMO exhibit the lowest median
for PSO and SCE, respectively, whereas CCS and PKGE have the highest.
Figure 11c depicts temporal variations in spatial correlation coefficients (
) for
simulations. CTR's
significantly drops Jul 5-6 (0.5 to -0.6), fluctuating at ~0.2 otherwise. CCS: PSO's
stable at -0.5, SCE increases Jul 5 (-0.4 to 0.5), fluctuating -0.2. EKGE: PSO 1, SCE 0.8. EMO: Both are ~1. MAES: PSO increases (0.2 to 1), SCE initially declines (0.4 to 0), then ~0.2. NSES: PSO ~0.7, drops post-Jul 10 to ~0.5; SCE ~0.2. PKGE: PSO ~0.45, sharp drop Jul 5 to ~-0.4; SCE -0.4 to unspecified, sharp drop, ~-0.3. PMO: PSO 0.6, sharp drop Jul 6 to -0.5, rises to 0.2; SCE 0.8, drops to 0, rises to 0.4. RMSES: PSO increases (0.5 to 0.8), stabilizes ~0.8 post-Jul 4; SCE ~0.45, declines Jul 4-6, rises to ~0.26. EMO and EKGE consistently outperform MAES for PSO and SCE in
, with other metrics displaying varied trends. Moreover,
Figure 11d illustrates the overall
distribution for
. Median
ranking from highest to lowest for PSO: EMO > EKGE > MAES > RMSES > NSES > PMO > PKGE > CCS; for SCE: EKGE > EMO > MAES > PMO > PKGE > NSES > RMSES > CCS. Notably, EMO and EKGE exhibit the highest median
for PSO and SCE, respectively, whereas CCS has the lowest.
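As an illustration of the two time-series diagnostics used here, the sketch below computes, at each time step, the root-mean-square error and the spatial (across-site) Pearson correlation between observations and simulations, and then takes their medians. The data layout (one list of site values per time step), the function names, and the numbers are assumptions, not the study's actual processing chain.

```python
import math
import statistics


def rmse(obs, sim):
    """Root-mean-square error between paired observations and simulations."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def pearson(obs, sim):
    """Pearson correlation across sites; NaN if either series is constant."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_s = sum(sim) / n
    soo = sum((o - mean_o) ** 2 for o in obs)
    sss = sum((s - mean_s) ** 2 for s in sim)
    if soo == 0 or sss == 0:
        return float("nan")
    sos = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    return sos / math.sqrt(soo * sss)


def per_time_step(metric, obs_by_time, sim_by_time):
    """Apply a metric across sites separately at every time step."""
    return [metric(o, s) for o, s in zip(obs_by_time, sim_by_time)]


# Hypothetical values at three sites for two time steps (illustrative only).
obs_by_time = [[0.20, 0.25, 0.30], [0.22, 0.27, 0.31]]
sim_by_time = [[0.21, 0.24, 0.33], [0.20, 0.29, 0.30]]

rmse_series = per_time_step(rmse, obs_by_time, sim_by_time)
corr_series = per_time_step(pearson, obs_by_time, sim_by_time)
print(statistics.median(rmse_series), statistics.median(corr_series))
```

Taking the median over time steps, as in the rankings above, makes the summary less sensitive to short episodes such as rainfall events than a mean would be.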
For
, CTR's
shows a marked diurnal variation, averaging 8K fluctuations (
Figure S2-3-1). Due to overlapping diurnal error ranges, its performance complexity surpasses
. Notably, NSES, PKGE, PMO peak
>14K (CTR's max), indicating inferiority (
Figure 11e). Conversely, MAES and RMSES peak at 8K, surpassing CTR. EKGE and EMO, excluding initial days, also peak near 8K, outperforming CTR. Median
(K) ranking from highest to lowest yields the following order for PSO: PMO (7.5) > PKGE (6) > NSES (5.8) > CCS (4) > EKGE (3.5) > MAES (2.8) > RMSES (2.5) > EMO (2.48); and for SCE: PKGE (6.1) > PMO (5.8) > NSES (5.6) > CCS (3.6) > EKGE (3.3) > MAES (2.9) > RMSES (2.7) > EMO (2.5) (
Figure 11f). In both PSO and SCE, EMO exhibits the lowest median
, whereas PMO and PKGE respectively possess the highest.
Furthermore, for
, CTR's
varies from -0.5 to 0.7, showing distinct diurnal patterns (
Figure 11g). Overlapping diurnal error ranges complicate performance compared to
(
Figure S2-3-2). CCS and EKGE's max
< 0.7 (CTR's max), indicating inferiority. NSE, PKGE, PMO max
rival CTR, but min
> -0.5, outperforming CTR. EMO, MAES, RMSE max
~0.8, exceeding CTR. Hence, for
,
performance ranks EMO, MAES, and RMSE best, followed by NSE, PKGE, and PMO, while CCS and EKGE perform poorly. Moreover,
Figure 11h illustrates the overall
distribution for
. Median
ranking from highest to lowest for PSO: EMO > MAES > RMSES > NSES > PMO > PKGE > EKGE > CCS; for SCE: EMO > RMSES > CCS > EKGE > MAES > PKGE > PMO > NSES. Notably, EMO exhibits the highest median
for both PSO and SCE, whereas CCS and NSES have the lowest.
In summary, for , EKGE and EMO exhibit the lowest median and the highest median for PSO and SCE, respectively, whereas CCS and PKGE have the highest for PSO and SCE, respectively, and CCS has the lowest for both. For , in both PSO and SCE, EMO exhibits the lowest median , whereas PMO and PKGE respectively possess the highest; EMO exhibits the highest median for both PSO and SCE, whereas CCS and NSES have the lowest. Generally, EKGE and EMO have the best and performances of for PSO and SCE, respectively, while EMO has the best and performances of for both.