A Novel Data Cleaning Framework Based on Knowledge Graph

Cited by: 0
Authors
Song, Yuanfeng [1]
Zhang, Danni [2]
Li, Xiaodong [1]
Luo, Kunming [3]
Liao, Jianming [3]
Affiliations
[1] Univ Elect Sci & Technol China, Informat Ctr, Chengdu, Peoples R China
[2] Southwest Jiaotong Univ, Informatizat & Network Management Off, Chengdu, Peoples R China
[3] Univ Elect Sci & Technol China, Sch CSE, Chengdu, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
data cleaning; knowledge graph; error repair; knowledge inference; EDITING RULES; FIXES;
DOI
10.1109/BigCom57025.2022.00050
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In real-world applications, data cleaning has long been a challenge in both academia and industry. Unsuccessful data cleaning may lead to inaccurate analysis and untrustworthy decision-making. This paper proposes a novel knowledge graph-based data cleaning framework. By establishing a knowledge graph and the relationship patterns in the data, the framework obtains implicit and explicit relationships and uses them to perform pattern repair and inference repair on dirty data. Pattern repair includes both explicit and implicit relationship matching, while inference repair includes both attribution inference repair and rule inference repair. The experimental results show that the more association relations there are among data tables, the greater the improvement in cleaning efficiency; moreover, the more association knowledge the knowledge graph contains, the more pronounced the improvement in cleaning efficiency.
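The two repair stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy triple store `KG`, the relation names `located_in` and `part_of`, and the helper functions `lookup`, `pattern_repair`, and `inference_repair` are all hypothetical, chosen only to show how an explicit relationship in a knowledge graph might correct a dirty value and how chaining relations might infer a missing one.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical toy knowledge graph stored as (subject, relation, object) triples.
KG: Set[Triple] = {
    ("Chengdu", "located_in", "Sichuan"),
    ("Sichuan", "part_of", "China"),
}


def lookup(kg: Set[Triple], subject: Optional[str], relation: str) -> Optional[str]:
    """Return the object of the first matching triple, or None if there is none."""
    for s, r, o in kg:
        if s == subject and r == relation:
            return o
    return None


def pattern_repair(record: Dict[str, str], kg: Set[Triple],
                   src: str, relation: str, dst: str) -> Dict[str, str]:
    """Explicit relationship matching (illustrative only).

    If the graph holds an edge record[src] --relation--> X and record[dst]
    disagrees with X, overwrite record[dst] with X.
    """
    repaired = dict(record)
    expected = lookup(kg, record.get(src), relation)
    if expected is not None and repaired.get(dst) != expected:
        repaired[dst] = expected
    return repaired


def inference_repair(record: Dict[str, str], kg: Set[Triple],
                     src: str, relations: List[str], dst: str) -> Dict[str, str]:
    """Rule-style inference (illustrative only).

    Follow a chain of relations starting from record[src]; if the whole
    chain resolves, write the final node into record[dst].
    """
    repaired = dict(record)
    node = record.get(src)
    for rel in relations:
        if node is None:
            break
        node = lookup(kg, node, rel)
    if node is not None:
        repaired[dst] = node
    return repaired


# A dirty record: misspelled province, missing country.
dirty = {"city": "Chengdu", "province": "Sichaun", "country": ""}
step1 = pattern_repair(dirty, KG, "city", "located_in", "province")
step2 = inference_repair(step1, KG, "city", ["located_in", "part_of"], "country")
print(step2)
```

In this sketch the graph is treated as authoritative whenever it has a matching edge; a real system would presumably weigh conflicting evidence and handle implicit relationships as well, which the sketch does not attempt.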
Pages: 350-355
Page count: 6
Related papers
50 in total
  • [1] A Novel General RFID Framework based on Middleware and Data Cleaning
    Ye, Lu
    FGCN: PROCEEDINGS OF THE 2008 SECOND INTERNATIONAL CONFERENCE ON FUTURE GENERATION COMMUNICATION AND NETWORKING, VOLS 1 AND 2, 2008, : 192 - 195
  • [2] Data cleaning method of resource management knowledge graph
    Zhang, Yang
    Ding, Xiangqian
    Yu, Shusong
    2022 IEEE 6TH ADVANCED INFORMATION TECHNOLOGY, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IAEAC), 2022, : 311 - 319
  • [3] A knowledge graph-based data harmonization framework for secondary data reuse
    Abad-Navarro, Francisco
    Martinez-Costa, Catalina
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2024, 243
  • [4] A novel data-driven robust framework based on machine learning and knowledge graph for disease classification
    Lei, Zhenfeng
    Sun, Yuan
    Nanehkaran, Y. A.
    Yang, Shuangyuan
    Islam, Md Saiful
    Lei, Huiqing
    Zhang, Defu
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2020, 102 (102): : 534 - 548
  • [5] A Knowledge Graph Framework for Dementia Research Data
    Timon-Reina, Santiago
    Rincon, Mariano
    Martinez-Tomas, Rafael
    Kirsebom, Bjorn-Eivind
    Fladby, Tormod
    APPLIED SCIENCES-BASEL, 2023, 13 (18):
  • [6] A Knowledge Graph-Based Data Integration Framework Applied to Battery Data Management
    Kalayci, Tahir Emre
    Bricelj, Bor
    Lah, Marko
    Pichler, Franz
    Scharrer, Matthias K.
    Rubesa-Zrim, Jelena
    SUSTAINABILITY, 2021, 13 (03) : 1 - 17
  • [7] Advanced Technology Evolution Pathways of Nanogenerators: A Novel Framework Based on Multi-Source Data and Knowledge Graph
    Liu, Yufei
    Wang, Guan
    Zhou, Yuan
    Liu, Yuhan
    NANOMATERIALS, 2022, 12 (05)
  • [8] A novel framework of knowledge transfer system for construction projects based on knowledge graph and transfer learning
    Xu, Jin
    He, Mengqi
    Jiang, Ying
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 199
  • [9] Data-Driven Strain Sensor Design Based on a Knowledge Graph Framework
    Ke, Junmin
    Liu, Furong
    Xu, Guofeng
    Liu, Ming
    SENSORS, 2024, 24 (17)
  • [10] Knowledge Graph-Based Query Rewriting in a Relational Data Harmonization Framework
    Abolhassani, Neda
    Tung, Teresa
    Gomadam, Karthik
    Ramaswamy, Lakshmish
    2016 IEEE 2ND INTERNATIONAL CONFERENCE ON COLLABORATION AND INTERNET COMPUTING (IEEE CIC), 2016, : 433 - 438