Review of Automatic Citation Classification Based on Machine Learning

Authors
Zhou Z. [1]
Affiliation
[1] Health Science Library, Peking University, Beijing
Keywords
Automatic Citation Classification; Citation Content Analysis; Machine Learning; Natural Language Processing; Text Classification
DOI
10.11925/infotech.2096-3467.2021.0608
Abstract
[Objective] This paper summarizes the application of natural language processing and machine learning techniques to automatic citation classification. [Coverage] We searched the Scopus database for “citation classification”, “citation polarity”, “citation function”, and “feature selection”, and retrieved a total of 46 representative publications. [Methods] We reviewed these studies from the perspectives of the citation classification process, tasks, and methods, and then discussed future development trends and challenges. [Results] Research on citation classification is shifting from multi-class to binary classification. Deep learning models can classify the sentiments and functions of citations simultaneously. The challenges facing automatic citation classification include single-discipline corpora, the contested definition of citation contexts, and imbalanced classification data. [Limitations] This review does not discuss many of the classification systems used in industry. [Conclusions] Evaluation methods are needed for the reuse of scientific research outputs such as code, data, and corpora, which could help build open science. Combining citation classification with citation counts could establish a multi-dimensional evaluation model. Based on a user’s search results, a system could recommend documents that support or oppose the related research for further reading. © 2021, Chinese Academy of Sciences. All rights reserved.
Pages: 14-24
Page count: 10