Understanding Residential Address Patterns in Urban and Rural Areas: A Machine Learning Approach

Cited by: 0
Authors
Cruz, Paula [1, 2]
Vanneschi, Leonardo [1 ]
Painho, Marco [1 ]
Affiliations
[1] Univ Nova Lisboa, Nova Informat Management Sch NOVA IMS, Lisbon, Portugal
[2] Stat Portugal, Methodol & Informat Syst Dept, Lisbon, Portugal
Keywords
address validation; census; data quality; machine learning; multiclass classification; statistical operations; CLASSIFICATION; ALGORITHMS; VALIDATION;
DOI
10.1111/tgis.70003
Chinese Library Classification
P9 [Physical Geography]; K9 [Geography]
Discipline classification codes
0705; 070501
Abstract
Address data quality directly affects demographic and other spatial analyses, because poor-quality addresses can introduce uncertainty and bias. Most existing studies measure address quality by matching against reference databases, which can be expensive and time-consuming. To bridge this gap, we propose a multiclass classification algorithm that evaluates the syntactic quality of residential addresses in a large database without relying on external databases. Specifically, we adopt a multi-objective optimization approach based on the NSGA-II algorithm and two modified k-NN algorithms. The objective is to find the address components, as well as the optimal number of neighboring examples, that help explain which quality class (good, incorrect or incomplete, and anomalous) an address belongs to, by type of region (urban, medium urban, and rural). The results indicate that the proposed approach outperforms the best baseline algorithms on multiclass classification, while also providing descriptive information on the most relevant features and the median local neighborhood of each instance. With this study, we extend previous research on address pattern extraction by explicitly differentiating urban and rural areas as well as invalid and anomalous addresses.
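
The trade-off described in the abstract (which address components to use, and how many neighbors to consult) can be illustrated with a minimal sketch. It is not the authors' method: it uses plain k-NN rather than their two modified k-NN variants, and it enumerates every candidate exhaustively instead of evolving a population with NSGA-II, keeping only the non-dominated (Pareto) candidates, which is the selection idea NSGA-II builds on. The toy data, the feature names, the two objectives (leave-one-out error and number of features), and all function names are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical toy data: each row flags which address components are present;
# labels are the three quality classes named in the abstract.
FEATURES = ["street", "door_number", "postcode", "locality"]
DATA = [
    ((1, 1, 1, 1), "good"),
    ((1, 1, 1, 0), "good"),
    ((1, 1, 1, 1), "good"),
    ((1, 0, 1, 1), "incomplete"),
    ((1, 0, 0, 1), "incomplete"),
    ((1, 0, 1, 0), "incomplete"),
    ((0, 0, 0, 1), "anomalous"),
    ((0, 0, 0, 0), "anomalous"),
    ((0, 1, 0, 0), "anomalous"),
]


def distance(a, b, mask):
    """Hamming distance restricted to the features selected by mask."""
    return sum(1 for x, y, m in zip(a, b, mask) if m and x != y)


def loo_error(mask, k):
    """Leave-one-out error rate of plain k-NN for one (mask, k) candidate."""
    errors = 0
    for i, (x, label) in enumerate(DATA):
        others = [d for j, d in enumerate(DATA) if j != i]
        others.sort(key=lambda d: distance(x, d[0], mask))
        votes = Counter(lbl for _, lbl in others[:k])
        if votes.most_common(1)[0][0] != label:
            errors += 1
    return errors / len(DATA)


def dominates(p, q):
    """p dominates q if it is no worse on every objective and better on one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def pareto_front(candidates):
    """Keep the candidates whose objectives no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(o["objectives"], c["objectives"]) for o in candidates)
    ]


# Evaluate every non-empty feature subset with a few neighborhood sizes.
candidates = []
for mask in product([0, 1], repeat=len(FEATURES)):
    if not any(mask):
        continue
    for k in (1, 3, 5):
        # Both objectives are minimized: classification error, feature count.
        objectives = (loo_error(mask, k), sum(mask))
        candidates.append({"mask": mask, "k": k, "objectives": objectives})

front = pareto_front(candidates)
for c in sorted(front, key=lambda c: c["objectives"]):
    chosen = [name for name, m in zip(FEATURES, c["mask"]) if m]
    error, n_features = c["objectives"]
    print(f"k={c['k']} features={chosen} error={error:.2f} n_features={n_features}")
```

Each printed line is one compromise between accuracy and the number of address components used. In a search space too large to enumerate, an evolutionary algorithm such as NSGA-II would approximate this front instead of computing it exactly.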
Pages: 17
Related papers
50 in total
  • [41] HEALTH PRACTICES - PRENATAL-CARE PATTERNS IN RURAL AND URBAN AREAS
    ROBINE, JM
    MAGUIN, P
    NICAUD, V
    HATTON, F
    REVUE D EPIDEMIOLOGIE ET DE SANTE PUBLIQUE, 1985, 33 (03): : 203 - 211
  • [42] Spatial and temporal patterns of air pollutants in rural and urban areas of India
    Sharma, Disha
    Kulshrestha, U. C.
    ENVIRONMENTAL POLLUTION, 2014, 195 : 276 - 281
  • [43] Hospitalized pneumonia - Outcomes, treatment patterns, and costs in urban and rural areas
    Lave, JR
    Fine, MJ
    Sankey, SS
    Hanusa, BH
    Weissfeld, LA
    Kapoor, WN
    JOURNAL OF GENERAL INTERNAL MEDICINE, 1996, 11 (07) : 415 - 421
  • [44] MACHINE LEARNING APPROACHES TO UNDERSTANDING AND PREDICTING PATTERNS OF ADHERENCE
    Chakraborty, Shayok
    Bhattacharya, Aditya
    Tian, Shubo
    Roque, Nelson
    He, Zhe
    Boot, Walter
    INNOVATION IN AGING, 2021, 5 : 551 - 551
  • [45] FUZZY APPROACH OF RURAL AND URBAN AREAS DELIMITATION MODELING IN GIS
    Paszto, Vit
    Marek, Lukas
    Sedonik, Jiri
    12TH INTERNATIONAL MULTIDISCIPLINARY SCIENTIFIC GEOCONFERENCE, SGEM 2012, VOL. II, 2012, : 1049 - 1056
  • [46] Understanding CSR champions: a machine learning approach
    Bilokha, Alona
    Cheng, Mingying
    Fu, Mengchuan
    Hasan, Iftekhar
    ANNALS OF OPERATIONS RESEARCH, 2024,
  • [47] Estimating the importance of environmental factors influencing the urban heat island for urban areas in Greece. A machine learning approach
    Petrou, Ilias
    Kassomenos, Pavlos
    JOURNAL OF ENVIRONMENTAL MANAGEMENT, 2024, 368
  • [48] The energy rebound effect of residential buildings: Evidence from urban and rural areas in China
    Du, Qiang
    Han, Xiao
    Li, Yi
    Li, Zhe
    Xia, Bo
    Guo, Xiqian
    ENERGY POLICY, 2021, 153
  • [49] Understanding temporary residential mobility during urban renewal: Insights from a structured community survey and machine learning analysis
    Chao, Hao
    Xu, Minghui
    Jin, Scarlett T.
    Kong, Hui
    APPLIED GEOGRAPHY, 2024, 172
  • [50] RECOGNITION OF RURAL RESIDENTIAL AREAS IN MOUNTAINOUS REGIONS USING DEEP LEARNING METHOD
    Zheng, Lijuan
    Zhang, Wei
    Wang, Xinlei
    Ai, Ping
    2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022), 2022, : 1892 - 1895