1. Introduction
Flooding is one of the most impactful natural disaster in term of human, economic and financial damage worldwide [
1]. The risk of increasingly impactful and damaging widespread flood events is further exacerbated by the changes in hydrological cycles induced by global warming [
2,
3].
Flood frequency analysis is an essential field of study in hydrology which concerns the estimation of peak flow levels corresponding to a certain return period, defined as the average time between occurrences of an extreme event. Such information on high flow events is useful in many settings such as dam design for flood protection, reservoir design and management, ecology management and river pollution control [
4]. For flood risk assessment, the estimated peak flows can serve as input to a hydraulic model to simulate possible overflowing of river banks onto nearby floodplains and measure the inundation depth and extent. This could lead to the development of flood risk maps, informing on potentially vulnerable flood-prone areas. Thus, ideally extreme streamflow levels in all areas of interest need to be estimated with precision.
If all locations of interest had observed data available, this estimation problem could simply be performed statistically by fitting a probabilistic distribution to the time series of observed discharge and deriving the extreme discharge values. Common distributions used to model extreme streamflows are the Gumbel, Generalized Extreme Value (GEV), Generalized Pareto (GP), Log-Pearson type III, Log-normal and Laplace distributions [
5,
6]. Estimating high discharge values from observed time series comes with inherent uncertainty attributed to the essentially rare occurrence of extreme events [
7].
In reality, river gauges are only available in a small subset of locations, due to limited resources. Estimating river flows in ungauged catchments remains one of the most complex challenges in catchment hydrology [
8]. One possible solution, sometimes called the indirect method, involves using rainfall data fields (from climate reanalysis products or downscaled climate models) to run rainfall-runoff models and simulate synthetic flows homogeneously over an entire geographical area. However, these synthetic flows are not point estimates but mean values averaged over a grid box. As such, they are usually accompanied with substantial uncertainty, stemming from their coarser resolution, the parameterization of the rainfall-runoff model [
9] and the uncertainty in the rainfall field itself [
10].
This motivates the use of spatial data-driven models to interpolate discharge values in ungauged areas from nearby gauged basins. This field named regional flood frequency analysis (RFFA) traditionally uses the index flood method [
11] to infer extreme peak flow in ungauged locations from similarly behaving gauged basins. In this method, the region of interest is divided into hydrologically homogeneous subregions using data-driven pooling or expert knowledge. A single frequency curve is then fitted to the observed discharge values in each subregion, which is then rescaled with location specific flood indexes. Although this method remains fairly popular in the hydrology community, its simplicity has raised criticism [
12] and often does not account for the full hydrological complexity within each subregion. The underlying assumption that each subregion has a common regional flood frequency curve is also not rigorously verified [
12]. Moreover, the partitioning into subregions can be arbitrary and contributes to increase uncertainty in the model [
13].
Alternatively, Bayesian hierarchical models (BHM) are a class of latent variable model which is suitable for data which follows a certain hierarchical structure [
14]. This structure is embedded in the model using a latent process layer, where hyperparameters are introduced to share information from different data subgroups. In RFFA, data is organized spatially (grouped by stations) but also temporally (grouped by time of measurement). The use of the Bayesian framework is advantageous for multiple reasons. Because all model parameters are estimated simultaneously, uncertainty is taken into account across the whole modeling chain and is directly provided as a by-product of the model. This is beneficial for analyzing extreme flood data in general, especially when the need to quantify flood uncertainty is emphasized [
15]. Furthermore, by pooling information from gauged data across all locations, the BHM mitigates the negative effects of locations with short data records, because data sharing allows estimation in these locations to borrow strength from other data-rich stations [
14].
Bayesian hierarchical models have been used extensively and with success in RFFA [
12,
15,
16,
17,
18,
19,
20]. Some studies incorporate spatial information using gaussian random fields to correlate flood describing parameters in the latent process [
12,
16,
21]. Oftentime, regression-based techniques are integrated inside the BHM to link streamflows to suitable covariates and allow prediction of extreme floods at ungauged catchments. For example, [
17] and [
22] use the drainage area as the main predictor which scales with the location and scale parameters of the GEV distributions used to describe discharge observations. Non-stationarity in time can also be embedded as part of the BHM model if such assumptions are justified, as is the case in [
18], where peak flow event durations are allowed to vary temporally. The studies mentioned above usually limit their geographical application to a river basin [
16,
17], a region [
23] or a small country (the UK [
20], Norway [
19]).
Finally, we note that non-linear machine learning and deep learning models are progressively gaining popularity as a promising tool to perform RFFA [
24,
25,
26,
27]. Most of these studies utilize at-site flood frequency analysis to derive design flood values at gauged stations, then use a machine learning model to predict design flood values at ungauged stations. One of the main drawbacks of this approach is that extreme value theory is not leveraged to extrapolate to unseen extreme events, so the prediction of discharge levels corresponding to very high return periods can be inaccurate. Furthermore, standard machine-learning based models do not explicitly account for the statistical uncertainty, even though attempts to remedy this problem have been made for RFFA using Bayesian networks [
28].
We bring several new methodological extensions to the standard Bayesian hierarchical model to perform regional flood frequency analysis in all catchments across North America and take advantage of the enhanced data quality and quantity collected. The scale of our study is continental, using substantially more gauge data and covariates than previous similar studies. A set of enriched 130 catchment specific covariates are used inside the BHM to describe the GEV distribution parameters. This constitutes a predictive model which can be used to estimate extreme discharge levels at all ungauged catchments, using extracted information from the gauge stations. To our knowledge, previous studies using a regression approach for RFFA were limited to less than 20 catchment descriptors, providing only a simplified description of catchment hydrology. By analyzing the regression coefficients estimated for each variable, we are also able to assess and classify their relevance for predicting extreme fluvial peak flows.
Second, to improve the precision of estimates, a modification is made to the data layer likelihood which allows using both annual maxima and daily discharge data whenever available. This increases the amount of high quality data for the subgroup of stations where daily data is available and reduces model uncertainty. Annual maxima are modeled with the GEV distribution and peaks over threshold extracted from daily data are modeled with the GP distribution. Both stem from the extreme value theory, where they are proven to be the asymptotic form for the distributions of extreme random variables [
7]. For the shape parameter of the GEV and GP distributions, the sign of its value determines the upper tail decay behavior, giving the model enough flexibility to cover a wide range of flood regimes [
6].
4. Discussion and Conclusion
The spatial variations in flood extremes are modeled using 130 catchment specific covariates as described in
Section 2.1.3. The fine spatial resolution of the catchment delineation used in our approach allows capturing complex local patterns in the GEV distribution parameters and return period peak flows. In the latent process layer of the Bayesian hierarchical model, we did not explicitly use a spatial structure as commonly found in the literature [
12,
20,
23,
44]. As explained in
Section 3.1, the analysis of the regression residuals did not reveal any spatial patterns based on the Euclidean distance between the stations, suggesting that such dependencies are already accounted for by the 130 covariates. Secondly, implementing a spatial random field in the latent process would increase the computational complexity, which might be prohibitive when applied to a continental scale study. Finally, since the geographical extent is large, extensive work would be needed to choose the appropriate form of spatial structure and station distance metric to consider, which we consider outside of the scope of this study. Therefore, following the Occam’s razor principle, we adopted the more parsimonious model with random spatially independent regression errors in the latent process.
We saw that the shape parameter could not fully be described by our chosen set of covariates, as the regression errors remain substantial for the shape after fitting the BHM model. Estimating the shape parameter in regional flood frequency analysis is a notoriously complex problem [
41,
44], as this parameter is very sensitive to the data quality and length. Traditionally, to reduce uncertainty associated with its estimation, former models in the literature often assumed it to be constant across entire regions where RFFA is performed [
12,
15,
22,
45]. However, for the geographical extent used in our study, this simplistic assumption would be too restrictive because of hydrological spatial heterogeneity. As the geographical extent increases, patterns in climatology, terrain characteristics and hydrological behaviors which affect the flood regimes become more varied [
46]. Therefore, we chose to model the shape parameter as a function of the static covariates at hand, which also allows prediction in ungauged catchments. Future areas of work can involve selecting more suitable covariates to improve the predictive power for the shape parameter. For example, [
19] showed that for Norway, the ratio of snowmelt in the total precipitation is a strong predictor of the shape parameter. However, it is unclear whether such data is retrievable for North America and what would be their quality.
We chose to apply the model independently for each HydroBASINS level 2 region, where hydrological behaviors are assumed to be relatively similar, as opposed to having a unique unified model for the whole North America. Attempts to develop such a model showed a drop in model goodness-of-fit, which can be attributed to the heterogeneity of flood regimes and climatology across the continent. As a matter of fact, the Koppen-Geiger classification of climate zones in North America consists of 5-6 big climate zones, with marked variations depending on the latitude and elevation [
47].
In this study, we developed a bayesian hierarchical model for regional flood frequency analysis at high spatial resolution in North America. Using our catchment delineation, flood peakflows can be estimated for every unit catchment with a mean area of across the continent. We leverage approximately 20000 gauge station data and a rich dataset of 130 static covariates for 1.8 million catchments to build a predictive BHM and predict the GEV distribution parameters for ungauged catchments. Not only does our model account for statistical uncertainty in extreme peak flow estimates, it manages to reduce estimation uncertainty in cases where additional informative data is available. As a matter of fact, a novelty in our methodology consists in using both discharge annual maxima and excesses over a threshold in the data layer to estimate the GEV distribution parameters. This approach is able to reduce uncertainty at sites where both types of data are available, especially for the scale and shape parameters. The estimated discharge return levels are compared with official peakflows from government sources, showing satisfying agreement. An importance score is calculated to rank the 130 covariates, showing that the most significant indicators of peakflows are the catchment drainage area, the geographical location and temperature-based bio-indicators. Our study contributes to the overall push in the hydrological research community to improve RFFA flood estimates and provides application at the continental scale, paving the way for global scale flood risk analysis.
Figure 1.
Station coverage and the 14 HydroBASINS level 2 regions for North America. Stations with only annual maximum data available are shown in blue, while stations with both types of data available are shown in red.
Figure 1.
Station coverage and the 14 HydroBASINS level 2 regions for North America. Stations with only annual maximum data available are shown in blue, while stations with both types of data available are shown in red.
Figure 2.
Discretization of the HydroBASIN region 9 into river networks (left) and catchments, for a small river network (right). The considered network is highlighted with a red contour.
Figure 2.
Discretization of the HydroBASIN region 9 into river networks (left) and catchments, for a small river network (right). The considered network is highlighted with a red contour.
Figure 3.
Bayesian hierarchical model structure.
Figure 3.
Bayesian hierarchical model structure.
Figure 4.
BHM model validation for regions 5 to 11, using a hold-out validation procedure. For validation sites, GEV distribution parameter estimated with the BHM model (x axis) are compared with their local estimates (y axis). The Pearson correlations for this comparison are also reported. A random half of the data is used for training and the other half for validation.
Figure 4.
BHM model validation for regions 5 to 11, using a hold-out validation procedure. For validation sites, GEV distribution parameter estimated with the BHM model (x axis) are compared with their local estimates (y axis). The Pearson correlations for this comparison are also reported. A random half of the data is used for training and the other half for validation.
Figure 5.
Normal quantile quantile plots for the three GEV distribution parameter residuals for all gauged data, regions 5 to 11.
Figure 5.
Normal quantile quantile plots for the three GEV distribution parameter residuals for all gauged data, regions 5 to 11.
Figure 6.
Spatial representation of the regression residuals, for the three GEV distribution parameters and for all gauged stations in regions 5 to 11. The residuals are classified into 4 groups splitted by the first, second and third quartile.
Figure 6.
Spatial representation of the regression residuals, for the three GEV distribution parameters and for all gauged stations in regions 5 to 11. The residuals are classified into 4 groups splitted by the first, second and third quartile.
Figure 7.
The top ten most significant covariates to predict GEV distribution parameters with the BHM model. The y-axis shows the covariate nomenclatures while the x-axis shows the importance score. The score is defined by summing the absolute value of the corresponding regression coefficients across all regions, after regional standardization and elimination of non-significant covariates.
Figure 7.
The top ten most significant covariates to predict GEV distribution parameters with the BHM model. The y-axis shows the covariate nomenclatures while the x-axis shows the importance score. The score is defined by summing the absolute value of the corresponding regression coefficients across all regions, after regional standardization and elimination of non-significant covariates.
Figure 8.
GEV distribution parameter estimates (mean of posterior distributions) for catchments in regions 5 to 11. Red, white and blue correspond to high, medium and low values respectively
Figure 8.
GEV distribution parameter estimates (mean of posterior distributions) for catchments in regions 5 to 11. Red, white and blue correspond to high, medium and low values respectively
Figure 9.
Comparison of GEV distribution parameter median estimates for the BHM model without daily data (x axis) and our BHM model with additional daily data (y axis). The comparison is done for all stations in region 9 where daily discharge data is available.
Figure 9.
Comparison of GEV distribution parameter median estimates for the BHM model without daily data (x axis) and our BHM model with additional daily data (y axis). The comparison is done for all stations in region 9 where daily discharge data is available.
Figure 10.
Comparison of GEV distribution parameter 95% confidence interval width for the BHM model without daily data (x axis) and our BHM model (with additional daily data). The comparison is done for all stations in region 9 where daily discharge data is available.
Figure 10.
Comparison of GEV distribution parameter 95% confidence interval width for the BHM model without daily data (x axis) and our BHM model (with additional daily data). The comparison is done for all stations in region 9 where daily discharge data is available.
Figure 11.
The 100 year return period discharge levels are plotted for catchments in regions 5 to 11. The three plots correspond to the median, lower end (0.1th quantile) and upper end (0.9th quantile) of the RP100 discharge distribution.
Figure 11.
The 100 year return period discharge levels are plotted for catchments in regions 5 to 11. The three plots correspond to the median, lower end (0.1th quantile) and upper end (0.9th quantile) of the RP100 discharge distribution.
Figure 12.
Comparison of BHM model estimated discharge with those from government sources (black and red dots for the US and Quebec respectively), for the 2, 10 and 100 year return periods, for regions 5 to 11.
Figure 12.
Comparison of BHM model estimated discharge with those from government sources (black and red dots for the US and Quebec respectively), for the 2, 10 and 100 year return periods, for regions 5 to 11.