Research on Knowledge Base Error Detection Method Based on Confidence Learning

Cited by: 0
Authors
Li W. [1 ,2 ]
Zhang Z. [1 ,2 ,3 ]
Affiliations
[1] National Science Library, Chinese Academy of Sciences, Beijing
[2] Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing
[3] Hubei Key Laboratory of Big Data in Science and Technology, Wuhan
Keywords
Confidence Learning; Error Detection; Knowledge Base
DOI
10.11925/infotech.2096-3467.2021.0179
Abstract
[Objective] This paper explores an error detection method for knowledge bases based on confidence learning, aiming to reduce the impact of noisy data. [Methods] We used the TransE model to represent knowledge base triples and a multi-layer perceptron to detect errors. We then cleaned the dataset with confidence learning and reduced the influence of noisy data through multiple rounds of iterative training. [Results] On DBpedia datasets, the method reached a best F1 value of 0.7364, outperforming the control group. [Limitations] The noisy data in the experiment was artificially generated, so its distribution differs from that of real-world data. More research is needed to evaluate the method on larger knowledge bases. [Conclusions] The proposed method reduces the influence of noisy data through confidence learning and detects knowledge base errors more effectively. © 2023 Chin J Gen Pract. All rights reserved.
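The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the per-class threshold rule follows the general confident learning idea (the mean predicted probability of a class over the samples labeled with it), and the probabilities are assumed to come from some classifier such as the multi-layer perceptron mentioned above.

```python
import math


def transe_score(h, r, t):
    """TransE distance ||h + r - t||_2; a lower value means a more plausible triple."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def class_thresholds(labels, probs, n_classes=2):
    """Per-class threshold: mean predicted probability of class k over samples labeled k."""
    thresholds = []
    for k in range(n_classes):
        vals = [p[k] for y, p in zip(labels, probs) if y == k]
        # Assumption: a class with no samples gets a threshold that is hard to reach.
        thresholds.append(sum(vals) / len(vals) if vals else 1.0)
    return thresholds


def suspected_label_errors(labels, probs, n_classes=2):
    """Indices where some other class is confidently predicted but the given label is not."""
    th = class_thresholds(labels, probs, n_classes)
    flagged = []
    for i, (y, p) in enumerate(zip(labels, probs)):
        confident = [k for k in range(n_classes) if p[k] >= th[k]]
        if confident and y not in confident:
            flagged.append(i)
    return flagged


# Toy data (made up for illustration): label 1 = correct triple, 0 = erroneous triple.
labels = [1, 1, 1, 0, 0, 1]
probs = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.2], [0.9, 0.1], [0.3, 0.7]]
flagged = suspected_label_errors(labels, probs)
cleaned = [i for i in range(len(labels)) if i not in flagged]
```

In an iterative setup like the one the abstract describes, the classifier would be retrained on the `cleaned` indices and the pruning repeated; the number of rounds and the stopping rule are not stated in this record.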
Pages: 1 / 9
Number of pages: 8
Related papers
19 in total
  • [1] Dong X, Gabrilovich E, Heitz G, et al., Knowledge Vault: A Web-Scale Approach to Probabilistic Knowledge Fusion, Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 601-610, (2014)
  • [2] Auer S, Bizer C, Kobilarov G, et al., DBpedia: A Nucleus for a Web of Open Data, Proceedings of International Semantic Web Conference, Asian Semantic Web Conference, pp. 722-735, (2007)
  • [3] Bollacker K, Evans C, Paritosh P, et al., FreeBase: A Collaboratively Created Graph Database for Structuring Human Knowledge, Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, pp. 1247-1250, (2008)
  • [4] Heindorf S, Potthast M, Stein B, et al., Vandalism Detection in Wikidata, Proceedings of the 25th ACM International Conference on Information and Knowledge Management, pp. 327-336, (2016)
  • [5] Aktolga E, Cartright M A, Allan J., Cross-document Cross-lingual Coreference Retrieval, Proceedings of the 17th ACM Conference on Information and Knowledge Management, pp. 1359-1360, (2008)
  • [6] Pilz A, Paass G., From Names to Entities Using Thematic Context Distance, Proceedings of the 20th ACM International Conference on Information and Knowledge Management, pp. 857-866, (2011)
  • [7] Vapnik V N, Lerner A Y., Recognition of Patterns with Help of Generalized Portraits, Avtomatika i Telemekhanika, 24, 6, pp. 774-780, (1963)
  • [8] Carlson A, Betteridge J, Wang R C, et al., Coupled Semi-supervised Learning for Information Extraction, Proceedings of the 3rd ACM International Conference on Web Search and Data Mining, pp. 101-110, (2010)
  • [9] Bordes A, Usunier N, Garcia-Duran A, et al., Translating Embeddings for Modeling Multi-Relational Data, Proceedings of the 26th International Conference on Neural Information Processing Systems, pp. 2787-2795, (2013)
  • [10] Lin Y K, Liu Z Y, Sun M S, et al., Learning Entity and Relation Embeddings for Knowledge Graph Completion, Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 2181-2187, (2015)