Effects of auxiliary and ancillary data on LULC classification in a heterogeneous environment using optimized random forest algorithm

Cited by: 11
Authors
Kavzoglu, Taskin [1]
Bilucan, Furkan [1]
Affiliations
[1] Gebze Tech Univ, Dept Geomat Engn, TR-41400 Gebze, Turkey
Keywords
Auxiliary data; Ancillary data; Genetic algorithm; HSIC-Lasso; Relief-F; Feature selection; LAND-COVER; PERFORMANCE; IMAGERY; INDEX; SELECTION; ACCURACY; FEATURES;
DOI
10.1007/s12145-022-00874-9
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Land use and land cover (LULC) maps, which provide crucial information for monitoring the Earth's surface, are among the most essential products for numerous studies. Using only spectral information in the classification process might lead to poor performance in areas with heterogeneous landscape characteristics. To overcome this problem, auxiliary and ancillary data are usually employed to improve classification accuracy. The objective of this study is to integrate auxiliary data (topographic and climatic features) and ancillary data (spectral indices and texture measures) with the spectral bands of Sentinel-2A imagery and to evaluate the performance of advanced feature selection methods. In this context, genetic algorithm-based random forest (GA-RF), HSIC-Lasso, and Relief-F feature selection approaches were used to determine the most informative features for the classification process from a high-dimensional dataset consisting of 102 features. The GA-RF algorithm selected 65 features, HSIC-Lasso chose 38 features, and Relief-F determined 51 features as ideal subsets. These feature subsets, together with the whole dataset, were input into a supervised classification process using the random forest (RF) classifier, whose parameters were optimized using a random search algorithm. The highest overall accuracy of the produced thematic maps was estimated as 91.05% for the subset determined by the HSIC-Lasso algorithm, which was also the fastest algorithm (5.71 s). McNemar's statistical significance test confirmed the superiority of the HSIC-Lasso method over the GA-RF and Relief-F algorithms. The SHapley Additive exPlanations (SHAP) method was also applied to analyze the relative importance of each feature according to the model output.
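The abstract reports that McNemar's test was used to compare classifications produced from different feature subsets. A minimal sketch of that kind of comparison is given below, assuming the common continuity-corrected form of the test; the function name `mcnemar_test` and the toy labels are hypothetical and are not taken from the paper, whose exact test variant is not stated in this record.

```python
import math

def mcnemar_test(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar's test for two classifiers
    evaluated on the same test samples (illustrative helper)."""
    # b: samples classifier A labels correctly and B labels wrongly
    b = sum(1 for t, pa, pb in zip(y_true, pred_a, pred_b)
            if pa == t and pb != t)
    # c: samples classifier A labels wrongly and B labels correctly
    c = sum(1 for t, pa, pb in zip(y_true, pred_a, pred_b)
            if pa != t and pb == t)
    if b + c == 0:
        # The classifiers never disagree in correctness: no evidence of a difference
        return 0.0, 1.0
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Toy data (made up for illustration): 30 test pixels of a single class
y_true = [1] * 30
pred_a = [1] * 25 + [0] * 5             # classifier A: wrong on the last 5
pred_b = [1] * 10 + [0] * 15 + [1] * 5  # classifier B: wrong on the middle 15

chi2, p_value = mcnemar_test(y_true, pred_a, pred_b)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

With these toy labels the discordant counts are b = 15 and c = 5, so the statistic is (|15 - 5| - 1)^2 / 20 = 4.05, which exceeds the 3.84 critical value at the 5% level; a difference of this kind would be read as statistically significant.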
Pages: 415 - 435
Page count: 21
Related papers
50 items in total
  • [1] Effects of auxiliary and ancillary data on LULC classification in a heterogeneous environment using optimized random forest algorithm
    Taskin Kavzoglu
    Furkan Bilucan
    Earth Science Informatics, 2023, 16 : 415 - 435
  • [2] Smart meter data classification using optimized random forest algorithm
    Zakariazadeh, Alireza
    ISA TRANSACTIONS, 2022, 126 : 361 - 369
  • [3] Random forest algorithm for classification of multiwavelength data
    Gao, Dan
    Zhang, Yan-Xia
    Zhao, Yong-Heng
    RESEARCH IN ASTRONOMY AND ASTROPHYSICS, 2009, 9 (02) : 220 - 226
  • [4] Random forest algorithm for classification of multiwavelength data
    Dan Gao
    Research in Astronomy and Astrophysics, 2009, 9 (02) : 220 - 226
  • [5] Random forest algorithm in big data environment
    Liu, Yingchun
    Computer Modelling and New Technologies, 2014, 18 (12) : 147 - 151
  • [6] Classification of attention levels using a Random Forest algorithm optimized with Particle Swarm Optimization
    Guadalupe Bedolla-Ibarra, Maria
    del Carmen Cabrera-Hernandez, Maria
    Antonio Aceves-Fernandez, Marco
    Tovar-Arriaga, Saul
    EVOLVING SYSTEMS, 2022, 13 (05) : 687 - 702
  • [7] Classification of attention levels using a Random Forest algorithm optimized with Particle Swarm Optimization
    María Guadalupe Bedolla-Ibarra
    Maria del Carmen Cabrera-Hernandez
    Marco Antonio Aceves-Fernández
    Saul Tovar-Arriaga
    Evolving Systems, 2022, 13 : 687 - 702
  • [8] Random Forest Algorithm for Linked Data Using a Parallel Processing Environment
    Jeon, Dongkyu
    Kim, Wooju
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2015, E98D (02) : 372 - 380
  • [9] Load Forecasting Based on Optimized Random Forest Algorithm in Cloud Environment
    Sui, Xin
    Zhao, Hailong
    Xu, Honghua
    Song, Xiaolong
    Liu, Dan
    Journal of Computers (Taiwan), 2024, 35 (03) : 13 - 26
  • [10] Object-based classification of hyperspectral data using Random Forest algorithm
    Amini, Saeid
    Homayouni, Saeid
    Safari, Abdolreza
    Darvishsefat, Ali A.
    GEO-SPATIAL INFORMATION SCIENCE, 2018, 21 (02) : 127 - 138