The VINITI RAS Automatic Text Classification System for Processing the Flow of Scientific Publications

Cited by: 1
Authors
Egorov, V. S. [1]
Kozlova, E. S. [1]
Lomotin, K. E. [1]
Fedorets, O. V. [1]
Filimonov, A. V. [1]
Shapkin, A. V. [1]
Affiliations
[1] Russian Acad Sci, All Russian Inst Sci & Tech Informat VINITI, Moscow 125315, Russia
Keywords
automatic text classification; Word2Vec; machine learning; perceptron; logistic regression; natural language processing; production technology of the information center
DOI
10.3103/S0005105520030048
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents the results of developing and testing an automatic classification system for scientific texts. The system determines the topic of a text according to three classification schemes, in both batch and interactive (dialog) modes. The paper describes the system's structural and functional components, the methods used to assess classification quality, the training methodology, the selection of the optimal classification model, and the main areas for introducing the automatic classifier into the processing of the electronic document flow at VINITI RAS.
Pages: 113 - 123
Page count: 11
Related papers
41 items in total
  • [1] The VINITI RAS Automatic Text Classification System for Processing the Flow of Scientific Publications
    V. S. Egorov
    E. S. Kozlova
    K. E. Lomotin
    O. V. Fedorets
    A. V. Filimonov
    A. V. Shapkin
    Automatic Documentation and Mathematical Linguistics, 2020, 54 : 113 - 123
  • [4] SYSTEM FOR AUTOMATIC CLASSIFICATION OF SCIENTIFIC LITERATURE
    GARFIELD, E
    MALIN, MV
    SMALL, H
    JOURNAL OF THE INDIAN INSTITUTE OF SCIENCE, 1975, 57 (02) : 61 - 74
  • [5] Transformation of Thematic Profiles of Serial Publications in an Information Center Documents Input System: Case Study of the VINITI RAS Database
    Soloshenko, N. S.
    Fedorets, O. V.
    Domnina, T. N.
    SCIENTIFIC AND TECHNICAL INFORMATION PROCESSING, 2022, 49 (04) : 220 - 230
  • [6] Transformation of Thematic Profiles of Serial Publications in an Information Center Documents Input System: Case Study of the VINITI RAS Database
    N. S. Soloshenko
    O. V. Fedorets
    T. N. Domnina
    Scientific and Technical Information Processing, 2022, 49 : 220 - 230
  • [7] ROETEXT - SYSTEM FOR AUTOMATIC TEXT PROCESSING IN DIAGNOSTIC RADIOLOGY
    NOVAK, D
    RADIOLOGE, 1974, 14 (06) : 277 - 285
  • [8] Automatic text summarization of scientific articles based on classification of extract's population
    Jaoua, M
    Ben Hamadou, A
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, PROCEEDINGS, 2003, 2588 : 623 - 634
  • [9] Image retrieval from scientific publications: Text and image content processing to separate multipanel figures
    Apostolova, Emilia
    You, Daekeun
    Xue, Zhiyun
    Antani, Sameer
    Demner-Fushman, Dina
    Thoma, George R.
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2013, 64 (05) : 893 - 908
  • [10] Classification of Text Processing Components: The Tesla Role System
    Hermes, Juergen
    Schwiebert, Stephan
    ADVANCES IN DATA ANALYSIS, DATA HANDLING AND BUSINESS INTELLIGENCE, 2010 : 285 - 294