Text Classification Based on Word2vec and Convolutional Neural Network

Cited by: 5
Authors
Li, Lin [1]
Xiao, Linlong [1]
Jin, Wenzhen [1]
Zhu, Hong [1]
Yang, Guocai [1]
Affiliations
[1] Southwest Univ, Sch Comp & Informat Sci, Chongqing, Peoples R China
Keywords
Text classification; Text representation; Word2vec; Convolutional neural network
DOI
10.1007/978-3-030-04221-9_40
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text representations used in text classification usually have high dimensionality and lack semantics, resulting in poor classification performance. In this paper, TF-IDF is optimized using optimization factors and then used to weight word2vec vectors, which carry semantic information, yielding the single-text representation model CD_STR. Based on the CD_STR model, latent semantic indexing (LSI) and the TF-IDF weighted vector space model (T_VSM) are merged to obtain a fusion model, CD_MTR, which is more efficient. A text classification method, MTR_MCNN, which combines the fusion model CD_MTR with a convolutional neural network, is further proposed. This method first designs convolution kernels of different sizes and numbers, allowing them to extract text features from different aspects. The text vectors trained by the CD_MTR model are then used as the input to the improved convolutional neural network. Tests on two datasets verify that the two models, CD_STR and CD_MTR, outperform other comparable text representation models. The MTR_MCNN method classifies better than the other comparison methods, and its classification accuracy is higher than that of the CD_MTR model.
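The core idea of weighting word vectors by TF-IDF can be illustrated with a minimal sketch. This is not the authors' CD_STR implementation: the abstract does not specify the optimization factors, so plain TF-IDF is assumed here, and the function names, toy corpus, and two-dimensional embeddings are all hypothetical stand-ins for a trained word2vec model.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {term: tf-idf} dict per tokenized document.

    Assumption: plain TF-IDF, without the paper's optimization factors.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def weighted_doc_vector(doc_weights, embeddings, dim):
    """Average the word vectors of a document, weighted by tf-idf."""
    total = [0.0] * dim
    weight_sum = 0.0
    for term, weight in doc_weights.items():
        vec = embeddings.get(term)
        if vec is None or weight == 0.0:
            continue  # skip out-of-vocabulary or zero-weight terms
        for i in range(dim):
            total[i] += weight * vec[i]
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # no usable terms: return the zero vector
    return [x / weight_sum for x in total]


# Hypothetical toy corpus and embeddings (stand-ins for word2vec output).
docs = [
    ["stock", "market", "rises"],
    ["team", "wins", "match"],
    ["stock", "team"],
]
embeddings = {
    "stock": [1.0, 0.0], "market": [0.8, 0.2], "rises": [0.6, 0.4],
    "team": [0.0, 1.0], "wins": [0.2, 0.8], "match": [0.1, 0.9],
}
doc_vectors = [
    weighted_doc_vector(w, embeddings, dim=2) for w in tfidf_weights(docs)
]
```

Each document is reduced to a single dense vector whose dimensionality equals that of the word embeddings, so rarer, more discriminative terms contribute more to the representation than common ones.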
Pages: 450-460
Number of pages: 11
Related papers
50 in total
  • [1] Text classification based on word2vec and convolutional neural networks
    Fan, Xiaojing
    Jiang, Mingyang
    Pei, Zhili
    BASIC & CLINICAL PHARMACOLOGY & TOXICOLOGY, 2019, 125 : 77 - 78
  • [2] Classification Bullying Tweet Using Convolutional Neural Network with Word2vec
    Ricko
    Sasongko, Priyo Sidik
    2021 5TH INTERNATIONAL CONFERENCE ON INFORMATICS AND COMPUTATIONAL SCIENCES (ICICOS 2021), 2021
  • [3] A News Recommendation Algorithm Based on Word2vec and Convolutional Neural Network
    Ding, Zhengqi
    Sun, Chang
    Sun, Gang
    Liu, Qihang
    Ma, Zhiyuan
    2022 THE 6TH INTERNATIONAL CONFERENCE ON VIRTUAL AND AUGMENTED REALITY SIMULATIONS, ICVARS 2022, 2022, : 96 - 100
  • [4] Research on Chinese Text Classification Based on Word2vec
    Yang, Zhi-Tong
    Zheng, Jun
    2016 2ND IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2016, : 1166 - 1170
  • [5] Microblogging Short Text Classification based on Word2Vec
    Zhang, Yonghui
    Liu, Jingang
    PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON ELECTRONIC, MECHANICAL, INFORMATION AND MANAGEMENT SOCIETY (EMIM), 2016, 40 : 395 - 401
  • [6] Short Text Classification Based on Wikipedia and Word2vec
    Liu Wensen
    Cao Zewen
    Wang Jun
    Wang Xiaoyi
    2016 2ND IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2016, : 1195 - 1200
  • [7] Word2vec convolutional neural networks for classification of news articles and tweets
    Jang, Beakcheol
    Kim, Inhwan
    Kim, Jong Wook
    PLOS ONE, 2019, 14 (08)
  • [8] Text Classification Research Based on Improved Word2vec and CNN
    Gao, Mengyuan
    Li, Tinghui
    Huang, Peifang
    SERVICE-ORIENTED COMPUTING, ICSOC 2018, 2019, 11434 : 126 - 135
  • [9] Diet Health Text Classification Based on word2vec and LSTM
    Zhao M.
    Du H.
    Dong C.
    Chen C.
    Nongye Jixie Xuebao/Transactions of the Chinese Society for Agricultural Machinery, 2017, 48 (10): 202 - 208
  • [10] Research on patent text classification based on Word2Vec and LSTM
    Xiao, Lizhong
    Wang, Guangzhong
    Zuo, Yang
    2018 11TH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID), VOL 1, 2018, : 71 - 74