A knowledge extraction framework for domain-specific application with simplified pre-trained language model and attention-based feature extractor

Cited by: 2
Authors
Zhang, Jian [1 ]
Qin, Bo [1 ]
Zhang, Yufei [1 ]
Zhou, Junhua [2 ]
Wang, Hongwei [1 ]
Affiliations
[1] Zhejiang Univ, ZJU UIUC Inst, Haining 314400, Zhejiang, Peoples R China
[2] Beijing Inst Elect Syst Engn, Beijing Simulat Ctr, Beijing 100000, Peoples R China
Keywords
Knowledge extraction; Named entity recognition; Pre-trained language model; Attention mechanism;
DOI
10.1007/s11761-022-00337-5
Chinese Library Classification
TP39 [Computer Applications];
Subject Classification Code
081203; 0835;
Abstract
With the advancement of industrial informatics, intelligent algorithms are increasingly applied in industrial products and applications. In this paper, we propose a knowledge extraction framework for domain-specific text. The framework extracts entities from text for subsequent tasks such as knowledge graph construction. It contains three modules: a domain-feature pre-trained model, LSTM-based flat named entity recognition, and attention-based nested named entity recognition. The domain-feature pre-trained model effectively learns features of the domain corpus, such as professional terms that are absent from general-domain corpora. The flat named entity recognition module uses vectors from the pre-trained model to extract entities from domain-specific text. The nested named entity recognition module, built on the attention mechanism and a weight sliding balance strategy, effectively identifies entity types with higher nesting rates. The framework achieves good results on nuclear power plant maintenance reports, and the domain pre-trained model and LSTM-based flat named entity recognition methods have been successfully applied to practical tasks.
Pages: 121-131
Number of pages: 11
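The abstract describes the framework's three modules but gives no implementation detail, so the following is only a minimal sketch, assuming a PyTorch-style setup, of how an LSTM-based flat NER module and an attention-based nested NER module might consume token vectors produced by a domain pre-trained model. All class names, dimensions, tag counts, and the multi-head attention layer are illustrative assumptions, not the authors' implementation, and the paper's weight sliding balance strategy is not reproduced here.

# Minimal sketch (illustrative only) of flat and nested NER modules that consume
# token vectors from a domain-feature pre-trained model (e.g. a BERT-like encoder).
import torch
import torch.nn as nn


class FlatNERTagger(nn.Module):
    """BiLSTM over pre-trained token vectors with one BIO-style tag per token."""

    def __init__(self, emb_dim=768, hidden_dim=256, num_tags=9):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, token_vectors):            # (batch, seq_len, emb_dim)
        context, _ = self.lstm(token_vectors)    # (batch, seq_len, 2 * hidden_dim)
        return self.classifier(context)          # (batch, seq_len, num_tags)


class NestedNERHead(nn.Module):
    """Multi-head self-attention feature extractor with independent per-type logits,
    so a token can belong to several (nested) entity types at once."""

    def __init__(self, emb_dim=768, num_types=5, num_heads=8):
        super().__init__()
        self.attention = nn.MultiheadAttention(emb_dim, num_heads, batch_first=True)
        self.type_scorer = nn.Linear(emb_dim, num_types)

    def forward(self, token_vectors):
        attended, _ = self.attention(token_vectors, token_vectors, token_vectors)
        return self.type_scorer(attended)        # (batch, seq_len, num_types)


if __name__ == "__main__":
    # Stand-in for the output of the domain-feature pre-trained model (768-dim vectors).
    token_vectors = torch.randn(2, 16, 768)
    print(FlatNERTagger()(token_vectors).shape)   # torch.Size([2, 16, 9])
    print(NestedNERHead()(token_vectors).shape)   # torch.Size([2, 16, 5])

In such a design, the flat tagger would typically be trained with a per-token cross-entropy (or CRF) objective, while the nested head would use independent per-type sigmoid losses so overlapping entity types do not compete for a single label.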