A knowledge extraction framework for domain-specific application with simplified pre-trained language model and attention-based feature extractor

Cited by: 2
Authors
Zhang, Jian [1 ]
Qin, Bo [1 ]
Zhang, Yufei [1 ]
Zhou, Junhua [2 ]
Wang, Hongwei [1 ]
Affiliations
[1] Zhejiang Univ, ZJU UIUC Inst, Haining 314400, Zhejiang, Peoples R China
[2] Beijing Inst Elect Syst Engn, Beijing Simulat Ctr, Beijing 100000, Peoples R China
Keywords
Knowledge extraction; Named entity recognition; Pre-trained language model; Attention mechanism;
DOI
10.1007/s11761-022-00337-5
Chinese Library Classification
TP39 [Computer Applications];
Subject Classification Code
081203; 0835;
Abstract
With the advancement of industrial informatics, intelligent algorithms are increasingly applied in industrial products and applications. In this paper, we propose a knowledge extraction framework for domain-specific text. The framework extracts entities from text for subsequent tasks such as knowledge graph construction. It contains three modules: a domain-feature pre-trained model, LSTM-based flat named entity recognition, and attention-based nested named entity recognition. The domain-feature pre-trained model effectively learns features of the domain corpus, such as professional terms that are absent from general-domain corpora. The flat named entity recognition module uses vectors from the pre-trained model to extract entities from domain-specific text. The nested named entity recognition module, built on the attention mechanism and a weight sliding balance strategy, effectively identifies entity types with higher nesting rates. The framework achieves good results on nuclear power plant maintenance reports, and the domain pre-trained model and LSTM-based flat named entity recognition methods have been successfully applied to practical tasks.
Pages: 121-131
Number of pages: 11
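The abstract describes the framework's three modules but gives no implementation detail, so the following is only a minimal sketch, assuming a PyTorch-style setup, of how an LSTM-based flat NER module and an attention-based nested NER module might consume token vectors produced by a domain pre-trained model. All class names, dimensions, tag counts, and the multi-head attention layer are illustrative assumptions, not the authors' implementation, and the paper's weight sliding balance strategy is not reproduced here.

# Minimal sketch (illustrative only) of flat and nested NER modules that consume
# token vectors from a domain-feature pre-trained model (e.g. a BERT-like encoder).
import torch
import torch.nn as nn


class FlatNERTagger(nn.Module):
    """BiLSTM over pre-trained token vectors with one BIO-style tag per token."""

    def __init__(self, emb_dim=768, hidden_dim=256, num_tags=9):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, token_vectors):            # (batch, seq_len, emb_dim)
        context, _ = self.lstm(token_vectors)    # (batch, seq_len, 2 * hidden_dim)
        return self.classifier(context)          # (batch, seq_len, num_tags)


class NestedNERHead(nn.Module):
    """Multi-head self-attention feature extractor with independent per-type logits,
    so a token can belong to several (nested) entity types at once."""

    def __init__(self, emb_dim=768, num_types=5, num_heads=8):
        super().__init__()
        self.attention = nn.MultiheadAttention(emb_dim, num_heads, batch_first=True)
        self.type_scorer = nn.Linear(emb_dim, num_types)

    def forward(self, token_vectors):
        attended, _ = self.attention(token_vectors, token_vectors, token_vectors)
        return self.type_scorer(attended)        # (batch, seq_len, num_types)


if __name__ == "__main__":
    # Stand-in for the output of the domain-feature pre-trained model (768-dim vectors).
    token_vectors = torch.randn(2, 16, 768)
    print(FlatNERTagger()(token_vectors).shape)   # torch.Size([2, 16, 9])
    print(NestedNERHead()(token_vectors).shape)   # torch.Size([2, 16, 5])

In such a design, the flat tagger would typically be trained with a per-token cross-entropy (or CRF) objective, while the nested head would use independent per-type sigmoid losses so overlapping entity types do not compete for a single label.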