The VINITI RAS Automatic Text Classification System for Processing the Flow of Scientific Publications

Cited by: 1
Authors
Egorov, V. S. [1]
Kozlova, E. S. [1]
Lomotin, K. E. [1]
Fedorets, O. V. [1]
Filimonov, A. V. [1]
Shapkin, A. V. [1]
Affiliation
[1] Russian Academy of Sciences, All-Russian Institute for Scientific and Technical Information (VINITI), Moscow 125315, Russia
Keywords
automatic text classification; Word2Vec; machine learning; perceptron; logistic regression; natural language processing; production technology of the information center
DOI
10.3103/S0005105520030048
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper presents the results of the development and testing of an automatic classification system for scientific texts that provides the functionality to determine the topic of texts by three classification schemes in batch and dialog modes. The structural and functional components, the methods used to assess the quality of classification, the teaching methodology, the selection of the optimal classification model, and the main areas for the introduction of an automatic classifier in the processing of electronic document flow at the VINITI RAS are described.
Pages: 113-123
Number of pages: 11