A novel explainable PSO-XGBoost model for regional flood frequency analysis at a national scale: Exploring spatial heterogeneity in flood drivers

Cited by: 4
Authors
Kanani-Sadat, Yousef [1]
Safari, Abdolreza [1]
Nasseri, Mohsen [2]
Homayouni, Saeid [3]
Affiliations
[1] Univ Tehran, Coll Engn, Sch Surveying & Geospatial Engn, Tehran, Iran
[2] Univ Tehran, Coll Engn, Sch Civil Engn, Tehran, Iran
[3] Inst Natl Rech Sci, Ctr Eau Terre Environnement, Quebec City, PQ, Canada
Keywords
Regional flood frequency analysis; Google Earth Engine; XGBoost; SHAP; Flood drivers; Spatial heterogeneity; SUPPORT VECTOR REGRESSION; PREDICTION; RAINFALL; SERIES; ERROR;
DOI
10.1016/j.jhydrol.2024.131493
Chinese Library Classification (CLC) number
TU [Building Science];
Discipline classification code
0813;
Abstract
Identifying flood drivers and accurately estimating design floods are crucial for sustainable and effective planning and management strategies that mitigate flood risk. Regional Flood Frequency Analysis (RFFA) is one of the most commonly used approaches for estimating design floods in ungauged watersheds. This study used XGBoost coupled with Particle Swarm Optimization (PSO) to estimate design-flood quantiles for return periods from 2 to 1000 years. After a preliminary assessment, 373 nationwide hydrometric stations were selected for at-site flood frequency analysis, in which the best-fitting distribution was identified. Using GIS and Google Earth Engine (GEE), 83 independent features, including physiographical, geomorphological, land-use, soil-type, and long-term hydro-climatic and environmental variables, were extracted for the upstream watersheds. After the hyper-parameters of the XGBoost method were fine-tuned for each flood quantile, the feature importance values were used to eliminate insignificant features and refine the developed models. Additionally, classical methods such as Support Vector Regression (SVR) and Random Forest (RF) were implemented to evaluate the efficiency of the XGBoost models. Several performance statistics showed that the models estimated flood quantiles effectively, with the Nash-Sutcliffe Efficiency (NSE) varying from 0.709 to 0.840 across all models. A comparison of model performance reveals that the XGBoost method outperformed RF and SVR across all flood quantiles. Based on the developed models, design floods were estimated for 949 stations across Iran. Furthermore, Shapley additive explanation (SHAP) values were used to identify the features that contribute most to the model outputs and to investigate the spatial heterogeneity of the main flood drivers.
According to the results, the perimeter and length of the watershed and heavy rainfall are notably more important than the other features in all models. Based on the local SHAP values, features associated with watershed size, such as perimeter, area, and length, are the most important in the Northern, Northwestern, and Western basins, whereas the Southwestern basins are more strongly influenced by "heavy rainfall". These findings demonstrate the promise of the developed models for estimating flood quantiles across diverse environmental, geomorphological, and hydro-climatic conditions. This capability is valuable for sustainable watershed management, especially in environments with limited maximum discharge data.
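The abstract reports model skill as Nash-Sutcliffe Efficiency (NSE). As a minimal sketch of how that statistic is commonly computed, the snippet below uses the standard definition, NSE = 1 - Σ(obs - sim)² / Σ(obs - mean(obs))². The function name and the sample values are hypothetical illustrations and are not taken from the study.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """Return NSE = 1 - SSE / (sum of squared deviations of the observations).

    NSE = 1 means a perfect match; NSE = 0 means the model is no better
    than always predicting the mean of the observations.
    """
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("observed and simulated must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    if sst == 0:
        raise ValueError("NSE is undefined when all observations are equal")
    return 1.0 - sse / sst


# Hypothetical flood quantiles (m^3/s) at four stations, for illustration only.
observed = [120.0, 340.0, 95.0, 410.0]
simulated = [130.0, 310.0, 110.0, 400.0]
nse = nash_sutcliffe_efficiency(observed, simulated)
print(f"NSE = {nse:.3f}")
```

Because the denominator is the spread of the observations around their mean, a model that only reproduces the mean scores 0, which is why values such as those reported in the abstract indicate skill well above that baseline.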
Pages: 25