Adaptive entity extraction method based on distant supervision

Cited by: 0
Authors
Ge L. [1]
Zhang Y. [2]
Li W. [2]
Affiliations
[1] School of Computer Science and Technology, University of Science and Technology of China, Hefei
[2] School of Software and Microelectronics, Peking University, Beijing
Keywords
Bidirectional long short-term memory neural network; Deep learning; Domain-specific knowledge graph; Entity extraction; Knowledge graph building; Ontology design; Positive unlabeled learning; Remote supervision
DOI
10.11990/jheu.202011020
Abstract
Traditional entity extraction algorithms for domain knowledge depend mainly on expert knowledge, require a large annotation effort, and are difficult to transfer to new fields. To address this problem, this paper proposes an entity extraction algorithm based on distant supervision and applies it to the field of grain and oil storage. Within a positive-unlabeled learning framework, the algorithm extracts entities in two stages: entity determination and entity classification. First, a bidirectional long short-term memory neural network (BiLSTM) performs binary entity identification. Second, a fully connected network identifies the entity type. Finally, the entities extracted by the algorithm are used to construct a knowledge graph for the grain and oil storage field, which verifies the feasibility of the algorithm. The algorithm is suited to entity extraction tasks with few training entity samples and reduces the corpus size required by BiLSTM-based entity extraction, while achieving results comparable to those of the classical BiLSTM-based algorithm. Copyright ©2022 Journal of Harbin Engineering University.
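The abstract does not say how the distantly supervised training labels are produced. As a rough illustration of the general idea only, the sketch below labels tokens by greedy longest match against a seed entity dictionary and treats everything else as unlabeled rather than negative, which is the usual positive-unlabeled setup. The function name `distant_labels`, the `max_len` parameter, the seed dictionary, and the English example sentence are all assumptions made for this sketch and are not taken from the paper.

```python
from typing import List, Sequence, Set, Tuple

POSITIVE = 1    # token lies inside a dictionary match
UNLABELED = 0   # no match: may or may not be an entity, NOT a confirmed negative


def distant_labels(tokens: Sequence[str],
                   dictionary: Set[Tuple[str, ...]],
                   max_len: int = 4) -> List[int]:
    """Label tokens by greedy longest match against a seed entity dictionary."""
    labels = [UNLABELED] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate span first so a two-word entry
        # wins over a one-word entry starting at the same position.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(tokens[i:i + n]) in dictionary:
                matched = n
                break
        if matched:
            for j in range(i, i + matched):
                labels[j] = POSITIVE
            i += matched
        else:
            i += 1
    return labels


# Hypothetical seed dictionary and sentence, for illustration only.
seed_entities = {("wheat",), ("rice", "weevil"), ("phosphine",)}
sentence = "phosphine fumigation controls the rice weevil in stored wheat".split()
print(distant_labels(sentence, seed_entities))
# prints [1, 0, 0, 0, 1, 1, 0, 0, 1]
```

In a pipeline like the one the abstract describes, labels of this kind would plausibly feed the first-stage binary entity identifier, and a separate classifier would then assign entity types to the spans it detects.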
Pages: 564-571
Page count: 7