A solution and practice for combining multi-source heterogeneous data to construct enterprise knowledge graph

Cited by: 1
Authors
Yan, Chenwei [1, 2]
Fang, Xinyue [3]
Huang, Xiaotong [1, 2]
Guo, Chenyi [4]
Wu, Ji [4]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Comp Sci, Natl Pilot Software Engn Sch, Beijing, Peoples R China
[2] Beijing Univ Posts & Telecommun, Key Lab Trustworthy Distributed Comp & Serv BUPT, Minist Educ, Beijing, Peoples R China
[3] Tsinghua Univ, Sch Econ & Management, Beijing, Peoples R China
[4] Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China
Source
FRONTIERS IN BIG DATA | 2023 / Vol. 6
Funding
National Natural Science Foundation of China;
Keywords
knowledge graph construction; heterogeneous data; knowledge graph update; enterprise knowledge graph; graph database;
DOI
10.3389/fdata.2023.1278153
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The knowledge graph is one of the essential infrastructures of artificial intelligence. Constructing a high-quality domain knowledge graph from multi-source heterogeneous data is a challenge for knowledge engineering. We propose a complete process framework for constructing a knowledge graph that combines structured and unstructured data. The framework includes data processing, information extraction, knowledge fusion, data storage, and update strategies, and aims to improve the quality of the knowledge graph and extend its life cycle. Specifically, we take the construction process of an enterprise knowledge graph as an example and integrate enterprise registration information, litigation-related information, and enterprise announcement information to enrich the enterprise knowledge graph. For the unstructured text, we improve an existing model to extract triples, and the F1-score of our model reaches 72.77%. The numbers of nodes and edges in our constructed enterprise knowledge graph reach 1,430,000 and 3,170,000, respectively. Furthermore, for each type of multi-source heterogeneous data, we apply corresponding methods and strategies for information extraction and data storage and carry out a detailed comparative analysis of graph databases. From the perspective of practical use, the informative enterprise knowledge graph and its timely update can serve many actual business needs. Our proposed enterprise knowledge graph has been deployed in HuaRong RongTong (Beijing) Technology Co., Ltd. and is used by the staff as a powerful tool for corporate due diligence. The key features are reported and analyzed in the case study. Overall, this paper provides an easy-to-follow solution and practice for domain knowledge graph construction and demonstrates its application in corporate due diligence.
Pages: 16
Related papers
50 in total
  • [21] Inferring, Summarizing and Mining Multi-source Graph Data
    Koutra, Danai
    2017 17TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2017), 2017, : 978 - 978
  • [22] Building Multi-Source Semantic Knowledge Graph for Drug Repositioning
    Han Z.
    Xinyu A.
    Chunhe L.
    Data Analysis and Knowledge Discovery, 2022, 6 (07) : 87 - 98
  • [23] Incremental Multi-source Entity Resolution for Knowledge Graph Completion
    Saeedi, Alieh
    Peukert, Eric
    Rahm, Erhard
    SEMANTIC WEB (ESWC 2020), 2020, 12123 : 393 - 408
  • [24] μKG: A Library for Multi-source Knowledge Graph Embeddings and Applications
    Luo, Xindi
    Sun, Zequn
    Hu, Wei
    SEMANTIC WEB - ISWC 2022, 2022, 13489 : 610 - 627
  • [25] KG-CFSA: a comprehensive approach for analyzing multi-source heterogeneous social network knowledge graph
    Akinnubi, Abiola
    Alassad, Mustafa
    Amure, Ridwan
    Agarwal, Nitin
    SOCIAL NETWORK ANALYSIS AND MINING, 2024, 14 (01)
  • [26] An Integration Model of Multi-Source Heterogeneous Audit Data
    Li Chunqiang
    Chai Weiyan
    Chen Linan
    PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON ELECTRONIC SCIENCE AND AUTOMATION CONTROL, 2015, 20 : 262 - 266
  • [27] SimbaQL: A Query Language for Multi-source Heterogeneous Data
    Li, Yuepeng
    Shen, Zhihong
    Li, Jianhui
    BIG SCIENTIFIC DATA MANAGEMENT, 2019, 11473 : 275 - 284
  • [28] Processing heterogeneous XML data from multi-source
    Wang, Tong
    Liu, Da-Xin
    Sun, Wei
    Lin, Xuanzuo
    MULTISENSOR, MULTISOURCE INFORMATION FUSION: ARCHITECTURES, ALGORITHMS, AND APPLICATIONS 2006, 2006, 6242
  • [29] Querying multi-source heterogeneous fuzzy spatiotemporal data
    Bai, Luyi
    Li, Nan
    Liu, Lishuang
    Hao, Xuesong
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2021, 40 (05) : 9843 - 9854
  • [30] Urban Multi-Source Spatio-Temporal Data Analysis Aware Knowledge Graph Embedding
    Zhao, Ling
    Deng, Hanhan
    Qiu, Linyao
    Li, Sumin
    Hou, Zhixiang
    Sun, Hai
    Chen, Yun
    SYMMETRY-BASEL, 2020, 12 (02):