Geospatial Entity Resolution

Cited by: 4
Authors
Balsebre, Pasquale [1]
Yao, Dezhong [2]
Cong, Gao [1]
Hai, Zhen [3]
Affiliations
[1] Nanyang Technol Univ, Singapore, Singapore
[2] Huazhong Univ Sci & Technol, Wuhan, Peoples R China
[3] Alibaba Grp, DAMO Acad, Singapore, Singapore
Funding
National Natural Science Foundation of China
Keywords
Entity resolution; neural networks; geospatial data; neighbourhood embedding; graph attention
DOI
10.1145/3485447.3512026
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
A geospatial database is today at the core of an ever-increasing number of services. Building and maintaining it remains challenging due to the need to merge information from multiple providers. Entity Resolution (ER) consists of finding entity mentions from different sources that refer to the same real-world entity. In geospatial ER, entities are often represented using different schemes and are subject to incomplete information and inaccurate location, making ER and deduplication daunting tasks. While tremendous advances have been made in traditional entity resolution and natural language processing, geospatial data integration approaches still rely heavily on static similarity measures and human-designed rules. In order to achieve automatic linking of geospatial data, a unified representation of entities with heterogeneous attributes and their geographical context is needed. To this end, we propose Geo-ER, a joint framework that combines Transformer-based language models, which have been successfully applied in ER, with a novel learning-based architecture to represent the geospatial character of the entity. Unlike existing solutions, Geo-ER does not rely on pre-defined rules and is able to capture information from surrounding entities in order to make context-based, accurate predictions. Extensive experiments on eight real-world datasets demonstrate the effectiveness of our solution over state-of-the-art methods. Moreover, Geo-ER proves to be robust in settings where there is no available training data for a specific city.
Pages: 3061-3070
Page count: 10