Document representation based on probabilistic word clustering in customer-voice classification

Authors
Younghoon Lee
Seokmin Song
Sungzoon Cho
Jinhae Choi
Affiliations
[1] Seoul National University, Department of Industrial Engineering and Institute for Industrial Systems Innovation
[2] LG Electronics, Data Driven User Experience Team, Mobile Communication Lab
Source
Pattern Analysis and Applications, 2019, 22 (01): 221 - 232
Keywords
Probabilistic word clustering; Document representation; Customer-voice; Classification
DOI
Not available
Abstract
Customer-voice data have an important role in different fields including marketing, product planning, and quality assurance. However, owing to the manual processes involved, there are problems associated with the classification of customer-voice data. This study focuses on building automatic classifiers for customer-voice data with newly proposed document representation methods based on neural-embedding and probabilistic word-clustering approaches. Semantically similar terms are classified into a common cluster. The words generated from neural embedding are clustered according to the membership strength of each word relative to each cluster derived from a probabilistic clustering method such as the fuzzy C-means clustering method or Gaussian mixture model. It is expected that the proposed method can be suitable for the classification of customer-voice data consisting of unstructured text by considering the membership strength. The results demonstrate that the proposed method achieved an accuracy of 89.24% with respect to representational effectiveness and an accuracy of 87.76% with respect to the classification performance of customer-voice data consisting of 12 classes. Further, the method provided an intuitive interpretation for the generated representation.
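The membership-strength idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how word memberships are aggregated into a document vector, so averaging is an assumption here, and the two-dimensional "embeddings", the cluster centers, and the function names `fcm_memberships` and `represent` are all hypothetical. The membership formula is the standard fuzzy C-means one, evaluated against fixed centers rather than centers learned from data.

```python
import math


def fcm_memberships(vec, centers, m=2.0):
    """Standard fuzzy C-means membership of one vector relative to fixed centers."""
    dists = [math.dist(vec, c) for c in centers]
    # A vector lying exactly on a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]


def represent(doc_tokens, embeddings, centers, m=2.0):
    """Assumed aggregation: average the membership vectors of known tokens."""
    rows = [
        fcm_memberships(embeddings[t], centers, m)
        for t in doc_tokens
        if t in embeddings
    ]
    if not rows:
        return [0.0] * len(centers)
    return [sum(col) / len(rows) for col in zip(*rows)]


# Hand-made toy values (hypothetical); real embeddings would come from a
# neural model and the centers from a fitted clustering method.
embeddings = {
    "battery": (0.9, 0.1),
    "charge": (0.8, 0.2),
    "screen": (0.1, 0.9),
    "display": (0.2, 0.8),
}
centers = [(1.0, 0.0), (0.0, 1.0)]

doc_vector = represent(["battery", "charge", "screen"], embeddings, centers)
print(doc_vector)
```

Each dimension of the resulting vector corresponds to one word cluster, which is what makes such a representation straightforward to inspect. A Gaussian mixture model could stand in for fuzzy C-means by replacing the membership function with posterior cluster probabilities.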
Pages: 221 - 232
Page count: 11
Related papers
50 in total
  • [1] Document representation based on probabilistic word clustering in customer-voice classification
    Lee, Younghoon
    Song, Seokmin
    Cho, Sungzoon
    Choi, Jinhae
    PATTERN ANALYSIS AND APPLICATIONS, 2019, 22 (01) : 221 - 232
  • [2] Applying convolution filter to matrix of word-clustering based document representation
    Lee, Younghoon
    Im, Jinbae
    Cho, Sungzoon
    Choi, Jinhae
    NEUROCOMPUTING, 2018, 315 : 210 - 220
  • [3] Document Representation with Statistical Word Senses in Cross-Lingual Document Clustering
    Tang, Guoyu
    Xia, Yunqing
    Cambria, Erik
    Jin, Peng
    Zheng, Thomas Fang
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2015, 29 (02)
  • [4] Combining Distributed Word Representation and Document Distance for Short Text Document Clustering
    Kongwudhikunakorn, Supavit
    Waiyamai, Kitsana
    JOURNAL OF INFORMATION PROCESSING SYSTEMS, 2020, 16 (02) : 277 - 300
  • [5] Document Classification Based on Word Vectors
    Liu, Rong
    Wang, Dong
    Xing, Chao
    2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 413 - 413
  • [6] Text sentiment classification based on a genetic algorithm and word and document co-clustering
    Kotelnikov, E. V.
    Pletneva, M. V.
    JOURNAL OF COMPUTER AND SYSTEMS SCIENCES INTERNATIONAL, 2016, 55 (01) : 106 - 114
  • [7] Text sentiment classification based on a genetic algorithm and word and document co-clustering
    E. V. Kotelnikov
    M. V. Pletneva
    Journal of Computer and Systems Sciences International, 2016, 55 : 106 - 114
  • [8] WORD DISTRIBUTED REPRESENTATION BASED TEXT CLUSTERING
    Feng, Shan
    Liu, Ruifang
    Wang, Qinlong
    Shi, Ruisheng
    2014 IEEE 3RD INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (CCIS), 2014, : 389 - 393
  • [9] Research on customer classification based on fuzzy clustering
    School of Electron and Inf. Eng., Dalian Univ. of Techno, Dalian 116085, China
    Unknown
    J. Comput. Inf. Syst., 2007, 5 (1971-1976):
  • [10] Document Sentiment Classification based on the Word Embedding
    Yin, Yanping
    Jin, Zhong
    PROCEEDINGS OF THE 4TH INTERNATIONAL CONFERENCE ON MECHATRONICS, MATERIALS, CHEMISTRY AND COMPUTER ENGINEERING 2015 (ICMMCCE 2015), 2015, 39 : 456 - 461