A novel explainable PSO-XGBoost model for regional flood frequency analysis at a national scale: Exploring spatial heterogeneity in flood drivers

Cited by: 4
Authors
Kanani-Sadat, Yousef [1]
Safari, Abdolreza [1]
Nasseri, Mohsen [2]
Homayouni, Saeid [3]
Affiliations
[1] Univ Tehran, Coll Engn, Sch Surveying & Geospatial Engn, Tehran, Iran
[2] Univ Tehran, Coll Engn, Sch Civil Engn, Tehran, Iran
[3] Inst Natl Rech Sci, Ctr Eau Terre Environnement, Quebec City, PQ, Canada
Keywords
Regional flood frequency analysis; Google Earth Engine; XGBoost; SHAP; Flood drivers; Spatial heterogeneity; SUPPORT VECTOR REGRESSION; PREDICTION; RAINFALL; SERIES; ERROR
DOI
10.1016/j.jhydrol.2024.131493
Chinese Library Classification
TU [Building Science]
Discipline classification code
0813
Abstract
Identifying flood drivers and accurately estimating design floods are crucial for sustainable and effective planning and management strategies that mitigate flood risks. Regional Flood Frequency Analysis (RFFA) is one of the most commonly used approaches to estimate design floods in ungauged watersheds. This study used XGBoost coupled with Particle Swarm Optimization (PSO) to estimate design flood quantiles for return periods from 2 to 1000 years. After a preliminary assessment, 373 nationwide hydrometric stations were selected for at-site flood frequency analysis, which identified the best-fitting distribution at each station. Using the capabilities of GIS and Google Earth Engine (GEE), 83 independent features, including physiographical, geomorphological, land-use, and soil-type variables as well as long-term hydro-climatic and environmental variables, were extracted for the upstream watersheds. After the hyper-parameters of the XGBoost method were fine-tuned for each flood quantile, the feature importance values were used to eliminate insignificant features and refine the developed models. Additionally, classical methods such as Support Vector Regression (SVR) and Random Forest (RF) were implemented to evaluate the efficiency of the XGBoost models. Different statistics demonstrated that the models effectively estimated flood quantiles, with the Nash-Sutcliffe Efficiency (NSE) varying from 0.709 to 0.840 across all models. A comparison of model performance reveals that the XGBoost method outperformed RF and SVR across all flood quantiles. Based on the developed models, design floods have been estimated for 949 stations across Iran. Furthermore, SHapley Additive exPlanations (SHAP) values were used to identify the main contributing features to model outputs and investigate the spatial heterogeneity of the main flood drivers.
According to the results, the perimeter and length of the watershed and heavy rainfall exhibit notably higher importance than other features in all models. Based on the local SHAP values, in the Northern, Northwestern, and Western basins, features associated with watershed size, such as perimeter, area, and length, exhibit the highest importance. Moreover, the Southwest basins are more influenced by "heavy rainfall". These findings demonstrate the promise of the developed models for estimating flood quantiles across diverse environmental, geomorphological, and hydro-climatic conditions. This capability is valuable for sustainable watershed management, especially in environments with limited maximum discharge data.
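The abstract reports model skill as Nash-Sutcliffe Efficiency (NSE). As a reading aid, the sketch below computes NSE from its standard definition, one minus the ratio of the squared-error sum to the variance sum of the observations. The function name and the discharge values are hypothetical illustrations, not data or code from the paper.

```python
from statistics import fmean


def nash_sutcliffe_efficiency(observed, simulated):
    """Return NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    if not observed:
        raise ValueError("at least one observation is required")

    mean_obs = fmean(observed)
    # Sum of squared model errors.
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Sum of squared deviations of the observations from their mean.
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE is undefined when all observations are equal")
    return 1.0 - residual / variance


# Hypothetical flood quantiles (m^3/s) at five stations; illustrative only.
observed = [120.0, 340.0, 95.0, 610.0, 220.0]
simulated = [135.0, 310.0, 110.0, 560.0, 240.0]
nse = nash_sutcliffe_efficiency(observed, simulated)
print(f"NSE = {nse:.3f}")
```

An NSE of 1 indicates a perfect match, and 0 means the model is no better than predicting the mean of the observations, which gives context for the 0.709 to 0.840 range quoted above.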
Pages: 25
Related papers
31 in total
  • [11] Bayesian regional flood frequency analysis with GEV hierarchical models under spatial dependency structures
    Sampaio, Julio
    Costa, Veber
    HYDROLOGICAL SCIENCES JOURNAL, 2021, 66 (03) : 422 - 433
  • [12] Usefulness of the reversible jump Markov chain Monte Carlo model in regional flood frequency analysis
    Ribatet, M.
    Sauquet, E.
    Gresillon, J. M.
    Ouarda, T. B. M. J.
    WATER RESOURCES RESEARCH, 2007, 43 (08)
  • [13] Regional flood frequency analysis based on a Weibull model: Part 2. Simulations and applications
    Heo, JH
    Salas, JD
    Boes, DC
    JOURNAL OF HYDROLOGY, 2001, 242 (3-4) : 171 - 182
  • [14] Selection of a basin-scale model for flood frequency analysis in Mahanadi river basin, India
    Swetapadma, Sonali
    Ojha, C. S. P.
    NATURAL HAZARDS, 2020, 102 (01) : 519 - 552
  • [15] Selection of a basin-scale model for flood frequency analysis in Mahanadi river basin, India
    Sonali Swetapadma
    C. S. P. Ojha
    Natural Hazards, 2020, 102 : 519 - 552
  • [16] Improving at-site flood frequency analysis with additional spatial information: a probabilistic regional envelope curve approach
    Lam, Daryl
    Thompson, Chris
    Croke, Jacky
    STOCHASTIC ENVIRONMENTAL RESEARCH AND RISK ASSESSMENT, 2017, 31 (08) : 2011 - 2031
  • [17] Improving at-site flood frequency analysis with additional spatial information: a probabilistic regional envelope curve approach
    Daryl Lam
    Chris Thompson
    Jacky Croke
    Stochastic Environmental Research and Risk Assessment, 2017, 31 : 2011 - 2031
  • [18] Local and regional flood frequency analysis based on hierarchical Bayesian model in Dongting Lake Basin, China
    Wu, Yun-biao
    Xue, Lian-qing
    Liu, Yuan-hong
    WATER SCIENCE AND ENGINEERING, 2019, 12 (04) : 253 - 262
  • [19] Local and regional flood frequency analysis based on hierarchical Bayesian model in Dongting Lake Basin, China
    Yun-biao Wu
    Lian-qing Xue
    Yuan-hong Liu
    Water Science and Engineering, 2019, 12 (04) : 253 - 262
  • [20] Regional flood frequency analysis based on a Weibull model: Part 1. Estimation and asymptotic variances
    Heo, JH
    Boes, DC
    Salas, JD
    JOURNAL OF HYDROLOGY, 2001, 242 (3-4) : 157 - 170