Text Classification Based on Word2vec and Convolutional Neural Network

Cited by: 5
Authors
Li, Lin [1]
Xiao, Linlong [1]
Jin, Wenzhen [1]
Zhu, Hong [1]
Yang, Guocai [1]
Affiliations
[1] Southwest Univ, Sch Comp & Informat Sci, Chongqing, Peoples R China
Keywords
Text classification; Text representation; Word2vec; Convolutional neural network;
DOI
10.1007/978-3-030-04221-9_40
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text representations used in text classification usually have high dimensionality and lack semantics, which results in poor classification performance. In this paper, TF-IDF is optimized with optimization factors and then used to weight word2vec vectors, which carry semantic information, yielding the single-text representation model CD_STR. Building on CD_STR, the latent semantic index (LSI) and the TF-IDF weighted vector space model (T_VSM) are merged in to obtain a fusion model, CD_MTR, which is more efficient. A text classification method, MTR_MCNN, is then proposed that combines the fusion model CD_MTR with a convolutional neural network. The method first designs convolution kernels of different sizes and numbers so that they extract text features from different aspects. The text vectors trained by the CD_MTR model are then used as the input to the improved convolutional neural network. Tests on two datasets verify that the two models, CD_STR and CD_MTR, outperform the other text representation models they are compared with. The MTR_MCNN method classifies better than the other methods compared, and its classification accuracy is higher than that of the CD_MTR model.
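The weighting step described in the abstract can be illustrated with a minimal sketch. It assumes plain TF-IDF, because the abstract does not specify the paper's optimization factors, and it uses small made-up word vectors in place of trained word2vec embeddings. The function names `tfidf_weights` and `weighted_doc_vector` are hypothetical and are not taken from the paper; this is not the CD_STR model itself.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {term: tf-idf} dict per tokenized document (plain TF-IDF)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def weighted_doc_vector(doc_weights, embeddings, dim):
    """TF-IDF-weighted average of word vectors; zero vector if nothing has weight."""
    total = [0.0] * dim
    weight_sum = 0.0
    for term, weight in doc_weights.items():
        vec = embeddings.get(term)
        if vec is None or weight == 0.0:
            continue  # skip out-of-vocabulary terms and terms with zero weight
        for i in range(dim):
            total[i] += weight * vec[i]
        weight_sum += weight
    if weight_sum == 0.0:
        return total
    return [x / weight_sum for x in total]


# Toy corpus and made-up 2-dimensional vectors (assumptions, not real word2vec output).
docs = [["stock", "market", "rises"], ["football", "match", "market"]]
embeddings = {
    "stock": [1.0, 0.0],
    "market": [0.5, 0.5],
    "rises": [0.8, 0.2],
    "football": [0.0, 1.0],
    "match": [0.1, 0.9],
}
doc_vectors = [weighted_doc_vector(w, embeddings, 2) for w in tfidf_weights(docs)]
```

In this toy corpus "market" occurs in both documents, so its inverse document frequency is zero and it does not contribute to either document vector. Only the terms that distinguish the two documents shape the representation.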
Pages: 450-460
Page count: 11
Related papers
50 in total
  • [21] Screening ideas in the early stages of technology development: A word2vec and convolutional neural network approach
    Hong, Suckwon
    Kim, Juram
    Woo, Han-Gyun
    Kim, Young-Choon
    Lee, Changyong
    TECHNOVATION, 2022, 112
  • [22] Support Vector Machines and Word2vec for Text Classification with Semantic Features
    Lilleberg, Joseph
    Zhu, Yun
    Zhang, Yanqing
    PROCEEDINGS OF 2015 IEEE 14TH INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS & COGNITIVE COMPUTING (ICCI*CC), 2015, : 136 - 140
  • [23] A Method of Feature Selection Based on Word2Vec in Text Categorization
    Tian, Wenfeng
    Li, Jun
    Li, Hongguang
    2018 37TH CHINESE CONTROL CONFERENCE (CCC), 2018, : 9452 - 9455
  • [24] Word2vec and dictionary based approach for Uyghur text filtering
    Tohti, Turdi
    Zhao, Yunxing
    Musajan, Winira
    2ND ANNUAL INTERNATIONAL CONFERENCE ON INFORMATION SYSTEM AND ARTIFICIAL INTELLIGENCE (ISAI2017), 2017, 887
  • [25] Turkish Document Classification Based on Word2Vec and SVM Classifier
    Sahin, Gurkan
    2017 25TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2017,
  • [26] Chinese comments sentiment classification based on word2vec and SVMperf
    Zhang, Dongwen
    Xu, Hua
    Su, Zengcai
    Xu, Yunfeng
    EXPERT SYSTEMS WITH APPLICATIONS, 2015, 42 (04) : 1857 - 1863
  • [27] A text retrieval algorithm based on the hybrid LDA and Word2Vec model
    Mu, Xue
    2019 INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION, BIG DATA & SMART CITY (ICITBS), 2019, : 373 - 376
  • [28] Research on Semantic Prediction Analysis of Tibetan Text Based on Word2Vec
    Ding Hai-lan
    Yu Hong-zhi
    Qi Kun-yu
    2018 INTERNATIONAL SYMPOSIUM ON POWER ELECTRONICS AND CONTROL ENGINEERING (ISPECE 2018), 2019, 1187
  • [29] Convolutional Neural Network with Contextualized Word Embedding for Text Classification
    Fan, Gaoyang
    Zhu, Cui
    Zhu, Wenjun
    2019 INTERNATIONAL CONFERENCE ON IMAGE AND VIDEO PROCESSING, AND ARTIFICIAL INTELLIGENCE, 2019, 11321
  • [30] Arabic Text Keywords Extraction using Word2vec
    Suleiman, Dima
    Awajan, Arafat A.
    Al Etaiwi, Wael
    2019 2ND INTERNATIONAL CONFERENCE ON NEW TRENDS IN COMPUTING SCIENCES (ICTCS), 2019, : 251 - 257