A novel explainable PSO-XGBoost model for regional flood frequency analysis at a national scale: Exploring spatial heterogeneity in flood drivers

Cited by: 4
Authors
Kanani-Sadat, Yousef [1]
Safari, Abdolreza [1]
Nasseri, Mohsen [2]
Homayouni, Saeid [3]
Affiliations
[1] Univ Tehran, Coll Engn, Sch Surveying & Geospatial Engn, Tehran, Iran
[2] Univ Tehran, Coll Engn, Sch Civil Engn, Tehran, Iran
[3] Inst Natl Rech Sci, Ctr Eau Terre Environnement, Quebec City, PQ, Canada
Keywords
Regional flood frequency analysis; Google Earth Engine; XGBoost; SHAP; Flood drivers; Spatial heterogeneity; SUPPORT VECTOR REGRESSION; PREDICTION; RAINFALL; SERIES; ERROR;
DOI
10.1016/j.jhydrol.2024.131493
Chinese Library Classification
TU [Building Science]
Discipline classification code
0813
Abstract
Identifying flood drivers and accurately estimating design floods are crucial for sustainable and effective planning and management strategies that mitigate flood risk. Regional Flood Frequency Analysis (RFFA) is one of the most commonly used approaches for estimating design floods in ungauged watersheds. This study used XGBoost coupled with Particle Swarm Optimization (PSO) to estimate design-flood quantiles for return periods from 2 to 1000 years. After a preliminary assessment, 373 nationwide hydrometric stations were selected to conduct at-site flood frequency analysis by identifying the best-fitting distribution. Using the capabilities of GIS and Google Earth Engine (GEE), 83 independent features, including physiographical, geomorphological, land-use, and soil-type characteristics and long-term hydro-climatic and environmental variables, were extracted for the upstream watersheds. After the hyper-parameters of the XGBoost method were fine-tuned for each flood quantile, feature importance values were used to eliminate insignificant features and refine the developed models. Additionally, classical methods such as Support Vector Regression (SVR) and Random Forest (RF) were implemented to evaluate the efficiency of the XGBoost models. Different statistics demonstrated that the models effectively estimated flood quantiles, with the Nash-Sutcliffe Efficiency (NSE) varying from 0.709 to 0.840 across all models. A comparison of model performance shows that XGBoost outperformed RF and SVR across all flood quantiles. Based on the developed models, design floods were estimated for 949 stations across Iran. Furthermore, SHapley Additive exPlanations (SHAP) values were used to identify the main contributing features to model outputs and to investigate the spatial heterogeneity of the main flood drivers.
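For readers unfamiliar with the NSE metric quoted in the abstract, the following is a minimal sketch of its standard definition, NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2). The function name and the sample values are illustrative assumptions and are not taken from the paper.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """Return the Nash-Sutcliffe Efficiency of simulated vs. observed values.

    1.0 means a perfect fit; 0.0 means the model is no better than always
    predicting the mean of the observations; negative values are worse.
    """
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    if len(observed) < 2:
        raise ValueError("at least two observations are required")

    mean_obs = sum(observed) / len(observed)
    # Sum of squared model errors.
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Sum of squared deviations of the observations from their mean.
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE is undefined when all observations are equal")
    return 1.0 - residual / variance


# Hypothetical flood quantiles (m^3/s) at four stations, for illustration only.
observed_q = [120.0, 340.0, 95.0, 510.0]
estimated_q = [135.0, 310.0, 110.0, 480.0]
nse = nash_sutcliffe_efficiency(observed_q, estimated_q)
```

Under this definition, the reported range of 0.709 to 0.840 would mean that the squared model error is roughly 16% to 29% of the variance of the at-site quantiles.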
According to the results, the perimeter and length of the watershed and heavy rainfall exhibit notably higher importance than other features in all models. Based on the local SHAP values, features associated with watershed size, such as perimeter, area, and length, exhibit the highest importance in the Northern, Northwestern, and Western basins. Moreover, the Southwestern basins are more influenced by "heavy rainfall". These findings demonstrate the promise of the developed models for estimating flood quantiles across diverse environmental, geomorphological, and hydro-climatic conditions. This capability is valuable for sustainable watershed management, especially in environments with limited maximum discharge data.
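To illustrate what a local SHAP value expresses, the sketch below computes exact Shapley values for a single prediction by enumerating feature coalitions. It is a simplified illustration under stated assumptions: absent features are replaced by a fixed baseline value, the toy model and its coefficients are invented, and this is not the tree-based SHAP algorithm typically applied to XGBoost models, nor the authors' implementation.

```python
from itertools import combinations
from math import factorial


def exact_shapley(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating all coalitions.

    Features outside a coalition are set to their baseline value
    (a simplifying assumption made for this sketch).
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_flood_model(features):
    """Hypothetical model: features are [perimeter, heavy_rainfall]."""
    perimeter, heavy_rainfall = features
    return 2.0 * perimeter + 0.5 * heavy_rainfall + 0.1 * perimeter * heavy_rainfall


sample = [3.0, 10.0]    # hypothetical watershed
baseline = [1.0, 4.0]   # hypothetical reference (e.g. dataset average)
contributions = exact_shapley(toy_flood_model, sample, baseline)
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline. Computing such per-watershed contributions and comparing them across regions is the kind of analysis the abstract describes for spatial heterogeneity.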
Pages: 25