Automatic documents classification

Cited by: 1
Authors
Mohamed, Hoda K. [1]
Affiliations
[1] Ain Shams U, Fac Engn, Comp & Syst Engn Dept, Cairo, Egypt
Keywords
text classification; information retrieval; stemmer algorithm; natural language processing and neural networks
DOI
10.1109/ICCES.2007.4447022
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Automatic document classification is of paramount importance to knowledge management in the information age. Document classification poses many challenges for learning systems, since the feature vector used to represent a document must capture some of the complex semantics of natural language. In this paper, we design an automatic document classification system and investigate the parameters and design decisions that affect the building of automatic classifiers. The system creates an item vector for each retrieved document and assigns a weight to each item. The vectors are selected using a combination of techniques from a stemming algorithm and natural language processing. Several weighting schemes are used. Documents are classified using a neural network (NN). We investigate different cases applied to the NN classifier, grouped by weighting scheme, the effect of weighting words in the title, and the number of inputs to the classifier. The performance of the classifier under these cases is analyzed.
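
The abstract does not say which stemmer or weighting schemes were used, so the following is only a minimal sketch of the general idea of a weighted item vector with extra weight for title words. The suffix stripper, the TF-IDF weighting, the stopword list, and the `title_boost` value are all assumptions made for illustration, not the paper's method; the resulting vectors are the kind of input a neural network classifier could take.

```python
import math
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "to"}

def crude_stem(word):
    """Toy suffix stripper; a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def tokenize(text):
    """Lowercase, strip punctuation, drop stopwords, then stem."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return [crude_stem(w) for w in words if w and w not in STOPWORDS]

def item_vectors(docs, title_boost=2.0):
    """docs is a list of (title, body) pairs.

    Returns the sorted vocabulary and one weight vector per document.
    Each weight is (term count + title boost) * log(N / document frequency).
    """
    counts = []
    for title, body in docs:
        c = Counter(tokenize(body))
        for term in tokenize(title):
            c[term] += title_boost  # hypothetical extra weight for title words
        counts.append(c)
    vocab = sorted(set().union(*counts))
    n = len(docs)
    df = {t: sum(1 for c in counts if t in c) for t in vocab}
    vectors = [[c[t] * math.log(n / df[t]) for t in vocab] for c in counts]
    return vocab, vectors

# Hypothetical two-document corpus.
docs = [
    ("Neural networks", "Training neural networks for text"),
    ("Cooking pasta", "Boiling pasta in water"),
]
vocab, vectors = item_vectors(docs)
```

Varying `title_boost`, swapping the weighting formula, or truncating `vocab` to change the number of classifier inputs corresponds loosely to the kinds of cases the abstract says were compared.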
Pages: 33-37
Page count: 5