Landslide Susceptibility Prediction Considering Spatio-Temporal Division Principle of Training/Testing Datasets in Machine Learning Models

Cited by: 1

Authors:

Huang F. [1,2]

Ouyang W. [1]

Jiang S. [1]

Fan X. [2]

Lian Z. [3]

Zhou C. [1]

Affiliations:

[1] School of Infrastructure Engineering, Nanchang University, Nanchang

[2] State Key Laboratory of Geohazard Prevention and Geoenvironment Protection, Chengdu University of Technology, Chengdu

[3] Wuhan Center, China Geological Survey, Wuhan

Source:

Diqiu Kexue - Zhongguo Dizhi Daxue Xuebao/Earth Science - Journal of China University of Geosciences | 2024, Vol. 49, No. 05

Keywords:

engineering geology; landslide susceptibility; landslides; machine learning model; time series; training/testing dataset;

DOI:

10.3799/dqkx.2022.357

CLC number:

Subject classification number:

Abstract:

In most landslide susceptibility prediction (LSP) models, the landslide and non-landslide spatial datasets are divided into training and testing datasets at random in space. This spatially random division inevitably introduces uncertainty into LSP modelling. In theory, LSP modelling uses past landslide inventories to predict the spatial probability of future landslides, so the problem has marked time-series characteristics and not only spatially random ones. We therefore argue that the spatial datasets should be divided into training and testing datasets according to the time series of landslide occurrence. Taking Wencheng County, China, as an example, 11 types of environmental factors and 128 landslides with accurately known occurrence times are obtained. The landslide and non-landslide samples, linked to the environmental factors, are then divided into two types of training/testing datasets, one following the landslide time series and the other following spatially random division. To keep the choice of ratio from biasing the LSP results, the training/testing division ratios are set to 9:1, 8:2, 7:3, 6:4 and 5:5. This yields training/testing datasets under 10 combined working conditions (two division principles by five ratios). Finally, several typical machine learning models, such as Support Vector Machine (SVM), Multi-Layer Perceptron (MLP) and Random Forest (RF), are trained and tested to perform LSP and to analyse its uncertainties. The results show that: (1) the LSP uncertainties of the time series-based SVM, MLP and RF models are slightly lower than those of the spatially random-based models, which verifies the feasibility of division by time series; (2) the time-series division of training/testing datasets is in effect a "deterministic" case among the possible spatially random divisions, and it is more consistent with how landslides actually occur. Spatially random division of training and testing datasets remains feasible when landslide occurrence times are unavailable. © 2024 China University of Geosciences. All rights reserved.
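The two division principles described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `inventory` records and their dates are hypothetical, and only landslide samples are shown, since the abstract does not state how non-landslide samples are assigned under the time-series principle. The time-series split trains on the earliest landslides and tests on the most recent ones; the random split ignores occurrence time.

```python
import random
from datetime import date


def temporal_split(samples, train_fraction):
    """Train on the earliest events, test on the most recent ones."""
    ordered = sorted(samples, key=lambda s: s["date"])
    cut = round(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def random_split(samples, train_fraction, seed=0):
    """Shuffle the samples, ignoring occurrence time, then cut."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical inventory: 10 landslides with made-up occurrence dates.
inventory = [{"id": i, "date": date(2010 + i, 1, 1)} for i in range(10)]

# Two division principles x five ratios = 10 combined working conditions.
splits = {}
for fraction in (0.9, 0.8, 0.7, 0.6, 0.5):
    splits[("time_series", fraction)] = temporal_split(inventory, fraction)
    splits[("spatial_random", fraction)] = random_split(inventory, fraction)
```

Each entry in `splits` holds a `(training, testing)` pair that could then be passed to any classifier. Under the time-series principle, every training landslide occurs no later than every testing landslide, which mirrors the idea of predicting future landslides from past ones.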

Citation

Pages: 1607-1618

Page count: 11

