A solution and practice for combining multi-source heterogeneous data to construct enterprise knowledge graph

Cited by: 1
Authors
Yan, Chenwei [1, 2]
Fang, Xinyue [3]
Huang, Xiaotong [1, 2]
Guo, Chenyi [4]
Wu, Ji [4]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Comp Sci, Natl Pilot Software Engn Sch, Beijing, Peoples R China
[2] Beijing Univ Posts & Telecommun, Key Lab Trustworthy Distributed Comp & Serv BUPT, Minist Educ, Beijing, Peoples R China
[3] Tsinghua Univ, Sch Econ & Management, Beijing, Peoples R China
[4] Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China
Source
FRONTIERS IN BIG DATA | 2023 / Volume 6
Funding
National Natural Science Foundation of China;
Keywords
knowledge graph construction; heterogeneous data; knowledge graph update; enterprise knowledge graph; graph database;
DOI
10.3389/fdata.2023.1278153
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The knowledge graph is one of the essential infrastructures of artificial intelligence. Constructing a high-quality domain knowledge graph from multi-source heterogeneous data is a challenge for knowledge engineering. We propose a complete process framework for constructing a knowledge graph that combines structured and unstructured data. The framework covers data processing, information extraction, knowledge fusion, data storage, and update strategies, and aims to improve the quality of the knowledge graph and extend its life cycle. Specifically, we take the construction of an enterprise knowledge graph as an example and integrate enterprise registration information, litigation-related information, and enterprise announcement information to enrich the graph. For the unstructured text, we improve an existing model to extract triples, and the F1-score of our model reaches 72.77%. The constructed enterprise knowledge graph contains 1,430,000 nodes and 3,170,000 edges. Furthermore, for each type of multi-source heterogeneous data, we apply corresponding methods and strategies for information extraction and data storage and carry out a detailed comparative analysis of graph databases. From the perspective of practical use, the informative enterprise knowledge graph and its timely update can serve many actual business needs. Our proposed enterprise knowledge graph has been deployed at HuaRong RongTong (Beijing) Technology Co., Ltd. and is used by the staff as a tool for corporate due diligence. The key features are reported and analyzed in the case study. Overall, this paper provides an easy-to-follow solution and practice for domain knowledge graph construction and demonstrates its application in corporate due diligence.
Pages: 16
Related papers
50 in total
  • [31] Multi-Source and Heterogeneous Knowledge Organization and Representation for Knowledge Fusion in Cloud Manufacturing
    Liu, Jihong
    Xu, Wenting
    Zhan, Hongfei
    PROCEEDINGS OF INTERNATIONAL CONFERENCE ON SOFT COMPUTING TECHNIQUES AND ENGINEERING APPLICATION, ICSCTEA 2013, 2014, 250 : 55 - 61
  • [32] Multi-source heterogeneous data storage methods for omnimedia data space
    Zhuo, Wenbo
    INTERNATIONAL JOURNAL OF GRID AND UTILITY COMPUTING, 2024, 15 (3-4) : 314 - 322
  • [33] Knowledge Graph Construction Research From Multi-source Vulnerability Intelligence
    Du, Lin
    Xu, Chuanqi
    CYBER SECURITY, CNCERT 2022, 2022, 1699 : 177 - 184
  • [34] Multi-source Education Knowledge Graph Construction and Fusion for College Curricula
    Li, Zeju
    Cheng, Linya
    Zhang, Chunhong
    Zhu, Xinning
    Zhao, Hui
    2023 IEEE INTERNATIONAL CONFERENCE ON ADVANCED LEARNING TECHNOLOGIES, ICALT, 2023, : 359 - 363
  • [35] A WEBGIS FOR SHARING AND INTEGRATION OF MULTI-SOURCE HETEROGENEOUS SPATIAL DATA
    Tang, Jianzhi
    Ren, Yingchao
    Yang, Chongjun
    Shen, Lei
    Jiang, Jun
    2011 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS), 2011, : 2943 - 2946
  • [36] Adaptive Distributed Inference for Multi-source Massive Heterogeneous Data
    Yang, Xin
    Yan, Qi Jing
    Wu, Mi Xia
    ACTA MATHEMATICA SINICA-ENGLISH SERIES, 2024, 40 (11) : 2751 - 2770
  • [37] Construction of a multi-source heterogeneous hybrid platform for big data
    Wang, Ying
    Liu, Yiding
    Xia, Minna
    JOURNAL OF COMPUTATIONAL METHODS IN SCIENCES AND ENGINEERING, 2021, 21 (03) : 713 - 722
  • [38] Multi-source Heterogeneous Data Fusion for Toxin Level Quantification
    Strelet, Eugeniu
    Wang, Zhenyu
    Peng, You
    Castillo, Ivan
    Rendall, Ricardo
    Braun, Bea
    Joswiak, Mark
    Chiang, Leo
    Reis, Marco S.
    IFAC PAPERSONLINE, 2021, 54 (03) : 67 - 72
  • [39] Scalable Recommendation Models Fusing Multi-Source Heterogeneous Data
    Ji Z.-Y.
    Wu M.-D.
    Yang C.
    Li J.-D.
    Beijing Youdian Daxue Xuebao/Journal of Beijing University of Posts and Telecommunications, 2021, 44 (03) : 106 - 111
  • [40] Neural TV program recommendation with multi-source heterogeneous data
    Yin, Fulian
    Xing, Tongtong
    Wu, Zhaoliang
    Feng, Xiaoli
    Ji, Meiqi
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 119