A solution and practice for combining multi-source heterogeneous data to construct enterprise knowledge graph

Cited by: 1
Authors
Yan, Chenwei [1, 2]
Fang, Xinyue [3]
Huang, Xiaotong [1, 2]
Guo, Chenyi [4]
Wu, Ji [4]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Comp Sci, Natl Pilot Software Engn Sch, Beijing, Peoples R China
[2] Beijing Univ Posts & Telecommun, Key Lab Trustworthy Distributed Comp & Serv BUPT, Minist Educ, Beijing, Peoples R China
[3] Tsinghua Univ, Sch Econ & Management, Beijing, Peoples R China
[4] Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China
Source
Frontiers in Big Data | 2023 / Volume 6
Funding
National Natural Science Foundation of China
Keywords
knowledge graph construction; heterogeneous data; knowledge graph update; enterprise knowledge graph; graph database
DOI
10.3389/fdata.2023.1278153
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The knowledge graph is one of the essential infrastructures of artificial intelligence. Constructing a high-quality domain knowledge graph from multi-source heterogeneous data is a challenge for knowledge engineering. We propose a complete process framework for constructing a knowledge graph that combines structured and unstructured data. The framework includes data processing, information extraction, knowledge fusion, data storage, and update strategies, and aims to improve the quality of the knowledge graph and extend its life cycle. Specifically, we take the construction of an enterprise knowledge graph as an example and integrate enterprise registration information, litigation-related information, and enterprise announcement information to enrich it. For the unstructured text, we improve an existing model to extract triples, and our model reaches an F1-score of 72.77%. The numbers of nodes and edges in the constructed enterprise knowledge graph reach 1,430,000 and 3,170,000, respectively. Furthermore, for each type of multi-source heterogeneous data, we apply corresponding methods and strategies for information extraction and data storage, and we carry out a detailed comparative analysis of graph databases. From the perspective of practical use, the informative enterprise knowledge graph and its timely update can serve many actual business needs. Our proposed enterprise knowledge graph has been deployed at HuaRong RongTong (Beijing) Technology Co., Ltd., where staff use it as a tool for corporate due diligence. The key features are reported and analyzed in the case study. Overall, this paper provides an easy-to-follow solution and practice for domain knowledge graph construction and demonstrates its application in corporate due diligence.
Pages: 16