Deep Neural Networks for Czech Multi-label Document Classification

Times cited: 11
Authors
Lenc, Ladislav [1 ,2 ]
Kral, Pavel [1 ,2 ]
Affiliations
[1] Univ West Bohemia, Fac Appl Sci, Dept Comp Sci & Engn, Plzen, Czech Republic
[2] Univ West Bohemia, Fac Appl Sci, NTIS New Technol Informat Soc, Plzen, Czech Republic
Keywords
Czech; Deep neural networks; Document classification; Multi-label; FEATURES
DOI
10.1007/978-3-319-75487-1_36
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses automatic multi-label classification of Czech text documents. Current approaches usually rely on pre-processing, which can have a negative impact (loss of information, additional implementation work, etc.). We therefore omit it and use deep neural networks that learn from simple features, a choice motivated by their successful use in many other machine learning fields. Two networks are compared: the first is a standard multi-layer perceptron and the second is a popular convolutional network. Experiments on a Czech newspaper corpus show that both networks significantly outperform the baseline method, which uses a rich set of features with a maximum entropy classifier. The convolutional network gives the best results.
Pages: 460-471
Number of pages: 12