Fuzzy Bag-of-Topics Model for Short Text Representation

Authors
Jia, Hao [1]
Li, Qing [1]
Affiliations
[1] Shanghai University, School of Computer Engineering and Science, Shanghai, People's Republic of China
Keywords
Short text; Representation learning; Word communities
DOI
10.1007/978-3-030-04221-9_42
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Text representation is a keystone of many NLP tasks. For short text representation learning, the traditional Bag-of-Words (BoW) model is often criticized for its sparseness and for neglecting semantic information. The Fuzzy Bag-of-Words (FBoW) and Fuzzy Bag-of-Words Cluster (FBoWC) models improve on BoW and can learn dense, meaningful document vectors. However, the word clusters in the FBoWC model are obtained with the K-means clustering algorithm, which is unstable and may produce incoherent word clusters if it is not initialized properly. In this paper, we propose the Fuzzy Bag-of-Topics (FBoT) model to learn short text vectors. In the FBoT model, word communities, which are more coherent than the word clusters in FBoWC, are used as the basis terms of the text vector. Experimental results for short text classification on two datasets show that FBoT achieves the highest classification accuracies.
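The idea of using word communities as basis terms can be illustrated with a minimal sketch. It is not the authors' implementation: the membership function (mean cosine similarity to a community's words, clipped at zero), the toy two-dimensional embeddings, the hand-written communities, and the helper name `fuzzy_topic_vector` are all assumptions made for illustration. The abstract does not specify how communities are detected or how memberships are computed.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def fuzzy_topic_vector(tokens, embeddings, communities):
    """Map a short text to one dimension per word community.

    Assumption: a token's fuzzy membership in a community is the mean
    cosine similarity to that community's words, clipped at zero.
    Tokens without an embedding are skipped.
    """
    vec = np.zeros(len(communities))
    for tok in tokens:
        if tok not in embeddings:
            continue
        for j, community in enumerate(communities):
            sims = [cosine(embeddings[tok], embeddings[w]) for w in community]
            vec[j] += max(0.0, float(np.mean(sims)))
    return vec


# Hypothetical toy embeddings and communities, for illustration only.
embeddings = {
    "cat": np.array([1.0, 0.0]),
    "dog": np.array([0.9, 0.1]),
    "stock": np.array([0.0, 1.0]),
    "market": np.array([0.1, 0.9]),
}
communities = [["cat", "dog"], ["stock", "market"]]

doc_vec = fuzzy_topic_vector(["cat", "dog", "dog"], embeddings, communities)
print(doc_vec)
```

In this sketch the resulting vector has one entry per community rather than one per vocabulary word, so it is dense and short, and a text about animals gets most of its weight on the animal community.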
Pages: 473-482
Number of pages: 10