Investigating landslide data balancing for susceptibility mapping using generative and machine learning models

Authors
Jiang, Yuhang [1, 2]
Wang, Wei [1, 2]
Zou, Lifang [3]
Cao, Yajun [1, 2]
Xie, Wei-Chau [4]
Affiliations
[1] Hohai Univ, Geotech Res Inst, Nanjing 210098, Jiangsu, Peoples R China
[2] Hohai Univ, Key Lab Minist Educ Geomech & Embankment Engn, Nanjing 210098, Jiangsu, Peoples R China
[3] Hohai Univ, Sch Earth Sci & Engn, Nanjing 211100, Jiangsu, Peoples R China
[4] Univ Waterloo, Dept Civil & Environm Engn, 200 Univ Ave West, Waterloo, ON N2L 3G1, Canada
Funding
National Natural Science Foundation of China
Keywords
Landslide susceptibility mapping; Conditional Tabular Generative Adversarial Networks; Convolutional Neural Network; Long Short-Term Memory Neural Network; Self-training semi-supervised SVM algorithm; NETWORK;
DOI
10.1007/s10346-024-02352-3
Chinese Library Classification number
P5 [Geology]
Discipline classification codes
0709; 081803
Abstract
Machine learning has significantly advanced landslide susceptibility mapping. However, because field landslide investigations are difficult, susceptibility mapping usually suffers from too few landslide samples (positive samples) and unreliable non-landslide samples (negative samples). Taking Lianghe County in Yunnan Province, China, as an example, this paper evaluates the effectiveness of three oversampling models in generating positive landslide samples: Conditional Tabular Generative Adversarial Networks (CTGAN), Generative Adversarial Networks (GAN), and the traditional Synthetic Minority Oversampling Technique (SMOTE). Three machine learning classifiers are used for landslide susceptibility assessment: a 1D Convolutional Neural Network-Long Short-Term Memory Neural Network (1D CNN-LSTM), Random Forest (RF), and Gradient Boosting Decision Tree (GBDT). We also devise a screening method for non-landslide data (negative samples) that uses a self-trained support vector machine within a semi-supervised framework. The results show that training on the dataset after negative sample screening raises the AUC values of the 1D CNN-LSTM, RF, and GBDT models from (0.778, 0.869, 0.849) to (0.837, 0.936, 0.877), respectively. Compared with the original training set, training on data augmented by CTGAN, GAN, or SMOTE improves the prediction accuracy of all three machine learning models. The RF model, augmented with 200 positive samples generated by CTGAN, achieves the highest prediction accuracy in the study (AUC = 0.962). The 1D CNN-LSTM model achieves its highest prediction accuracy (AUC = 0.953) when augmented with 200 positive samples from GAN. Similarly, the GBDT model reaches its highest prediction accuracy (AUC = 0.928) when augmented with 200 positive samples created by SMOTE. In addition, the spatial distribution of the data indicates that the data generated by the generative adversarial models exhibit higher diversity and can be used for landslide susceptibility assessment.
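
The abstract names SMOTE as the traditional baseline for generating positive samples. As a rough illustration of that idea only, the sketch below creates synthetic minority-class rows by interpolating between a sample and one of its nearest minority-class neighbours. It is not the authors' implementation: the function name `smote`, the three features, and all sample values are hypothetical, and a real workflow would normally scale the features before computing distances.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical landslide samples: [slope in degrees, elevation in m, NDVI].
# Unscaled on purpose to keep the sketch short; elevation dominates distance.
landslides = [
    [32.0, 1450.0, 0.41],
    [28.5, 1380.0, 0.37],
    [35.2, 1520.0, 0.45],
    [30.1, 1410.0, 0.39],
]
new_rows = smote(landslides, n_new=200, k=3)
print(len(new_rows))
```

Because every synthetic row lies on a segment between two existing samples, SMOTE cannot leave the region already covered by the observed landslides, which is consistent with the abstract's remark that data from the generative adversarial models show higher diversity.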
Pages: 189-204
Number of pages: 16