Understanding Residential Address Patterns in Urban and Rural Areas: A Machine Learning Approach

Cited by: 0
Authors
Cruz, Paula [1, 2]
Vanneschi, Leonardo [1 ]
Painho, Marco [1 ]
Affiliations
[1] Univ Nova Lisboa, Nova Informat Management Sch NOVA IMS, Lisbon, Portugal
[2] Stat Portugal, Methodol & Informat Syst Dept, Lisbon, Portugal
Keywords
address validation; census; data quality; machine learning; multiclass classification; statistical operations; CLASSIFICATION; ALGORITHMS; VALIDATION
DOI
10.1111/tgis.70003
Chinese Library Classification (CLC) number
P9 [Physical Geography]; K9 [Geography]
Discipline classification code
0705; 070501
Abstract
Address data quality has a direct impact on demographic and other spatial analyses, since poor-quality addresses may lead to uncertainty and potential bias. Most existing studies measure address quality through matching with reference databases, which can be an expensive and time-consuming process. To bridge this gap, we propose a multiclass classification algorithm to evaluate the syntactic quality of residential addresses from a large database without using external databases. Namely, we adopt a multi-objective optimization approach, based on the NSGA-II algorithm and two modified k-NN algorithms. The objective is to find the address components as well as the optimal number of neighboring examples that help explain which class (good, incorrect or incomplete, and anomalous) the quality of an address belongs to, by type of region (urban, medium urban, and rural). The presented results indicate that the proposed approach outperforms the best baseline algorithms on multiclass classification, while also providing descriptive information on the most relevant features and median local neighborhood of each instance. With this study, we further extend previous research in the field of address pattern extraction, by explicitly differentiating urban and rural areas as well as invalid and anomalous addresses.
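The idea of jointly choosing address components (features) and the number of neighbors k under more than one objective can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' method: the toy data and feature names are hypothetical, the classifier is a plain k-NN with Hamming distance rather than the modified k-NN variants in the paper, the two objectives (leave-one-out error and number of selected features) are guesses at plausible objectives, and exhaustive enumeration stands in for NSGA-II, which would be needed for realistic search spaces.

```python
from collections import Counter
from itertools import product

# Hypothetical toy data: one binary vector per address, with illustrative
# features [has_street, has_door_number, has_postcode, has_noise_token].
X = [
    [1, 1, 1, 0], [1, 1, 1, 0], [1, 1, 1, 1],
    [1, 0, 1, 0], [0, 1, 1, 0], [1, 0, 0, 1],
    [0, 0, 0, 1], [0, 0, 1, 1], [0, 0, 0, 0],
]
y = ["good", "good", "good",
     "incomplete", "incomplete", "incomplete",
     "anomalous", "anomalous", "anomalous"]


def knn_predict(train_X, train_y, query, mask, k):
    """Plain k-NN: Hamming distance over the features selected by mask."""
    dists = sorted(
        (sum(1 for m, a, b in zip(mask, row, query) if m and a != b), i)
        for i, row in enumerate(train_X)
    )
    votes = Counter(train_y[i] for _, i in dists[:k])
    return votes.most_common(1)[0][0]


def loo_error(mask, k):
    """Leave-one-out misclassification rate for a feature mask and k."""
    errors = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_X, train_y, X[i], mask, k) != y[i]:
            errors += 1
    return errors / len(X)


def dominates(p, q):
    """p dominates q if it is no worse on every objective and differs from q."""
    return all(a <= b for a, b in zip(p, q)) and p != q


# Enumerate every non-empty feature mask and a few values of k
# (a stand-in for the evolutionary search that NSGA-II would perform).
candidates = [
    {"mask": mask, "k": k, "objectives": (loo_error(mask, k), sum(mask))}
    for mask in product([0, 1], repeat=4) if any(mask)
    for k in (1, 3, 5)
]

# Keep only the non-dominated candidates: the Pareto front.
pareto_front = [
    c for c in candidates
    if not any(dominates(o["objectives"], c["objectives"]) for o in candidates)
]

for c in sorted(pareto_front, key=lambda c: c["objectives"]):
    print(c["mask"], "k =", c["k"], "(error, n_features) =", c["objectives"])
```

Each entry on the resulting front is a trade-off between classification error and the number of address components used, which is the kind of descriptive output (relevant features and neighborhood size) the abstract refers to.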
Pages: 17
Related papers
50 in total
  • [21] The potential of City Information Modeling (CIM) in Understanding and Learning from the Impact of Urban Regulations on Residential Areas in Romania
    Ungureanu, Teodora
    NEW TECHNOLOGIES AND REDESIGNING LEARNING SPACES, VOL I, 2019, : 422 - 428
  • [22] Social mixing patterns in rural and urban areas of southern China
    Read, Jonathan M.
    Lessler, Justin
    Riley, Steven
    Wang, Shuying
    Tan, Li Jiu
    Kwok, Kin On
    Guan, Yi
    Jiang, Chao Qiang
    Cummings, Derek A. T.
    PROCEEDINGS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES, 2014, 281 (1785)
  • [23] Dietary patterns in Tanzania's transitioning rural and urban areas
    Paulo, Linda Simon
    Lenters, Virissa C.
    Chillo, Pilly
    Wanjohi, Milka
    Piedade, Goncalo J.
    Mende, Daniel R.
    Harris, Vanessa
    Kamuhabwa, Appolinary
    Kwesigabo, Gideon
    Asselbergs, Folkert W.
    Klipstein-Grobusch, K.
    JOURNAL OF HEALTH POPULATION AND NUTRITION, 2025, 44 (01)
  • [24] Identification of urban-rural integration types in China - an unsupervised machine learning approach
    Zeng, Qiyan
    Chen, Xiaofu
    CHINA AGRICULTURAL ECONOMIC REVIEW, 2023, 15 (02) : 400 - 415
  • [25] Predicting and understanding residential water use with interpretable machine learning
    Rachunok, Benjamin
    Verma, Aniket
    Fletcher, Sarah
    ENVIRONMENTAL RESEARCH LETTERS, 2024, 19 (01)
  • [26] How accurately does geocoding determine residential locations in rural and urban areas?
    Ward, MH
    Giglierano, J
    Wolter, C
    Miller, RS
    Nuckols, JR
    Hartge, P
    EPIDEMIOLOGY, 2003, 14 (05) : S102 - S103
  • [27] Understanding the effect of spatial patterns on the vulnerability of urban areas to flooding
    Sakieh, Yousef
    INTERNATIONAL JOURNAL OF DISASTER RISK REDUCTION, 2017, 25 : 125 - 136
  • [28] Understanding heat patterns produced by vehicular flows in urban areas
    Zhu, Rui
    Wong, Man Sing
    Guilbert, Eric
    Chan, Pak-Wai
    SCIENTIFIC REPORTS, 2017, 7
  • [30] Understanding Multi-Vehicle Collision Patterns on Freeways-A Machine Learning Approach
    Morris, Clint
    Yang, Jidong J.
    INFRASTRUCTURES, 2020, 5 (08)