A solution and practice for combining multi-source heterogeneous data to construct enterprise knowledge graph

Cited by: 1
Authors
Yan, Chenwei [1, 2]
Fang, Xinyue [3]
Huang, Xiaotong [1, 2]
Guo, Chenyi [4]
Wu, Ji [4]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Comp Sci, Natl Pilot Software Engn Sch, Beijing, Peoples R China
[2] Beijing Univ Posts & Telecommun, Key Lab Trustworthy Distributed Comp & Serv BUPT, Minist Educ, Beijing, Peoples R China
[3] Tsinghua Univ, Sch Econ & Management, Beijing, Peoples R China
[4] Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China
Source
FRONTIERS IN BIG DATA | 2023 / Vol. 6
Funding
National Natural Science Foundation of China
Keywords
knowledge graph construction; heterogeneous data; knowledge graph update; enterprise knowledge graph; graph database;
DOI
10.3389/fdata.2023.1278153
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The knowledge graph is one of the essential infrastructures of artificial intelligence. Constructing a high-quality domain knowledge graph from multi-source heterogeneous data is a challenge for knowledge engineering. We propose a complete process framework for constructing a knowledge graph that combines structured and unstructured data. The framework covers data processing, information extraction, knowledge fusion, data storage, and update strategies, and aims to improve the quality of the knowledge graph and extend its life cycle. Specifically, we take the construction of an enterprise knowledge graph as an example and integrate enterprise registration information, litigation-related information, and enterprise announcement information to enrich the graph. For the unstructured text, we improve an existing model to extract triples, and the F1-score of our model reaches 72.77%. The constructed enterprise knowledge graph contains 1,430,000 nodes and 3,170,000 edges. Furthermore, for each type of multi-source heterogeneous data, we apply corresponding methods and strategies for information extraction and data storage, and we carry out a detailed comparative analysis of graph databases. From a practical perspective, the informative enterprise knowledge graph and its timely updates can serve many actual business needs. Our proposed enterprise knowledge graph has been deployed at HuaRong RongTong (Beijing) Technology Co., Ltd., where staff use it as a tool for corporate due diligence. The key features are reported and analyzed in the case study. Overall, this paper provides an easy-to-follow solution and practice for domain knowledge graph construction and demonstrates its application in corporate due diligence.
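To make two ideas from the abstract concrete (fusing triples from several sources into one graph, and scoring extracted triples with F1), here is a minimal Python sketch. It is not the authors' method: this record does not describe their extraction model or fusion strategy. The function names, the normalization rule, and the sample triples are all hypothetical, chosen only for illustration.

```python
# Illustrative sketch only; every name and data value below is an assumption.
Triple = tuple[str, str, str]  # (head entity, relation, tail entity)


def normalize(name: str) -> str:
    """Hypothetical entity normalization: collapse whitespace and casefold."""
    return " ".join(name.split()).casefold()


def fuse(*sources: list[Triple]) -> set[Triple]:
    """Merge triples from several sources, dropping duplicates after normalization."""
    fused: set[Triple] = set()
    for triples in sources:
        for head, relation, tail in triples:
            fused.add((normalize(head), relation, normalize(tail)))
    return fused


def graph_size(triples: set[Triple]) -> tuple[int, int]:
    """Return (number of distinct nodes, number of distinct edges)."""
    nodes = {h for h, _, _ in triples} | {t for _, _, t in triples}
    return len(nodes), len(triples)


def triple_f1(predicted: set[Triple], gold: set[Triple]) -> float:
    """Exact-match F1 over triples: harmonic mean of precision and recall."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Made-up triples standing in for a structured registry and announcement text.
registry = [
    ("Acme Ltd", "legal_representative", "Zhang San"),
    ("Acme Ltd", "registered_in", "Beijing"),
]
announcements = [
    ("acme  ltd", "registered_in", "beijing"),  # duplicate of a registry fact
    ("Acme Ltd", "invests_in", "Beta Inc"),
]

fused = fuse(registry, announcements)
print(graph_size(fused))  # (nodes, edges) after deduplication

predicted = {("a", "r", "b"), ("a", "r", "c"), ("a", "r", "d")}
gold = {("a", "r", "b"), ("a", "r", "c"), ("a", "r", "e")}
print(round(triple_f1(predicted, gold), 4))
```

Exact-match scoring is the strictest common choice for triple extraction. Whether the reported 72.77% uses exact or partial matching is not stated in this record.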
Pages: 16