A novel topic clustering algorithm based on graph neural network for question topic diversity

Cited by: 7
Authors
Wu, Yongliang [1]
Wang, Xuejun [1]
Zhao, Wenbin [1]
Lv, Xiaofeng [2]
Affiliations
[1] Shijiazhuang Tiedao Univ, Sch Informat Sci & Technol, Hebei 050043, Peoples R China
[2] Hebei Normal Univ, Coll Comp & Cyber Secur, Hebei 050024, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Graph Neural Network; Topic clustering; Graph representation; MODEL; KNOWLEDGE; SENTIMENT;
DOI
10.1016/j.ins.2023.02.018
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
In community question answering, many questions have no topic labels, or their topic labels are highly diverse, which has become the biggest obstacle to building the bridge between users and posts. Topic clustering methods could alleviate this issue. However, existing research employed words as topic representation units and could not express the semantic relevance between topics. In this paper, we propose a novel Topic Clustering framework based on the Graph Neural Network (called TCGNN) to alleviate topic diversity in community question answering. First, we separately consider the relationship representation of existing topics and unlabeled topics. For manually labeled topics, we count the frequency of topics in community questions and construct a topic co-occurrence matrix to represent the topic relations. For unlabeled topics, we extract the core phrases from community questions and use them to indicate the topics of the questions. Then, we transform the topic co-occurrence matrix into a topic relation graph, optimizing the topic relevance and improving presentation efficiency. Next, we employ a graph neural network to embed the topic relation graph and obtain the vector representation of each topic. Finally, an improved K-means method is proposed for topic clustering based on the distance between topic vectors. Additionally, we briefly discuss the extended effect of topic clustering methods in other domains (bibliographic information and reviews). To the best of our knowledge, this is a pioneering work that considers topic clustering in multiple situations and offers a new perspective on applying graph neural networks to topic clustering. Our experiments compared TCGNN with prevalent clustering methods and with several combinations of text representation and graph embedding methods. The experimental results on four extensive and varied datasets (Stack Overflow, DBLP, Yelp, and Zhihu) show that TCGNN outperforms the prevalent baselines in Entropy and Purity.
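The co-occurrence step and the two evaluation measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `min_count` threshold, and the toy data are assumptions made for illustration, and the Purity and Entropy formulas are common textbook definitions that may differ from the ones used in the paper. The graph neural network embedding and the improved K-means step are not shown.

```python
from collections import Counter
from itertools import combinations
from math import log2


def cooccurrence_counts(question_topics):
    """Count how often each unordered topic pair labels the same question."""
    pair_counts = Counter()
    for topics in question_topics:
        # sorted(set(...)) drops duplicate labels and fixes the pair order
        for a, b in combinations(sorted(set(topics)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def relation_graph(pair_counts, min_count=1):
    """Build a weighted adjacency dict, dropping pairs seen fewer than min_count times.

    min_count is a hypothetical pruning parameter, not taken from the paper.
    """
    graph = {}
    for (a, b), n in pair_counts.items():
        if n >= min_count:
            graph.setdefault(a, {})[b] = n
            graph.setdefault(b, {})[a] = n
    return graph


def purity(clusters, gold):
    """Fraction of items that belong to the majority gold class of their cluster."""
    total = sum(len(c) for c in clusters)
    hits = sum(max(Counter(gold[x] for x in c).values()) for c in clusters if c)
    return hits / total


def entropy(clusters, gold):
    """Size-weighted average of the class entropy (in bits) inside each cluster."""
    total = sum(len(c) for c in clusters)
    score = 0.0
    for c in clusters:
        if not c:
            continue
        counts = Counter(gold[x] for x in c)
        h = -sum((n / len(c)) * log2(n / len(c)) for n in counts.values())
        score += (len(c) / total) * h
    return score


# Toy data (invented): each inner list holds the topic labels of one question.
questions = [
    ["python", "pandas"],
    ["python", "numpy"],
    ["pandas", "numpy", "python"],
    ["java", "spring"],
]
counts = cooccurrence_counts(questions)
graph = relation_graph(counts, min_count=2)

# Toy clustering of topics and hypothetical gold classes for evaluation.
gold = {"python": "py", "pandas": "py", "java": "jvm", "spring": "jvm"}
clusters = [["python", "pandas", "java"], ["spring"]]
print(purity(clusters, gold), entropy(clusters, gold))
```

Higher Purity and lower Entropy both indicate clusters that agree more closely with the gold classes, which is the sense in which the abstract reports TCGNN leading the baselines.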
Pages: 685-702
Number of pages: 18
Related papers
50 in total
  • [31] DAC: Descendant-aware clustering algorithm for network-based topic emergence prediction
    Jung, Sukhwan
    Segev, Aviv
    JOURNAL OF INFORMETRICS, 2022, 16 (03)
  • [32] Graph neural topic model with commonsense knowledge
    Zhu, Bingshan
    Cai, Yi
    Ren, Haopeng
    INFORMATION PROCESSING & MANAGEMENT, 2023, 60 (02)
  • [33] Improving consumption diversity via graph-based topic nudging
    Vercoutere, Stefaan
    Joris, Glen
    De Pessemier, Toon
    Martens, Luc
    User Modeling and User-Adapted Interaction, 2025, 35 (02)
  • [34] Graph Local Clustering for Topic Detection in Web Collections
    Garza, Sara E.
    Brena, Ramon
    LA-WEB: 2009 LATIN AMERICAN WEB CONGRESS, 2009 : 207 - 213
  • [35] Novel Similarity Measure for Document Clustering Based on Topic Phrases
    ELdesoky, A. E.
    Saleh, M.
    Sakr, N. A.
    ICNM: 2009 INTERNATIONAL CONFERENCE ON NETWORKING & MEDIA CONVERGENCE, 2007, : 92 - +
  • [36] Toward topic diversity in recommender systems: integrating topic modeling with a hashing algorithm
    Yang, Donghui
    Wang, Yan
    Shi, Zhaoyang
    Wang, Huimin
    ASLIB JOURNAL OF INFORMATION MANAGEMENT, 2025, 77 (01) : 47 - 69
  • [37] Automatic Topic Detection with an Incremental Clustering Algorithm
    Zhang, Xiaoming
    Li, Zhoujun
    WEB INFORMATION SYSTEMS AND MINING, 2010, 6318 : 344 - 351
  • [38] A NOVEL TOPIC SELECTION ALGORITHM BASED ON WORD DISTRIBUTION
    Tsai, Chun-Wei
    Huang, Ko-Wei
    Hsu, Heng-Yao
    Chiang, Ming-Chao
    Yang, Chu-Sing
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2010, 6 (04) : 1843 - 1864
  • [39] Novel PageRank algorithm based on topic and link weighted
    Yang, Gelan
    Tu, Li
    Huazhong Keji Daxue Xuebao (Ziran Kexue Ban)/Journal of Huazhong University of Science and Technology (Natural Science Edition), 2012, 40 (SUPPL.1) : 300 - 303
  • [40] Case-related Topic Summarization Based on Topic Interaction Graph
    Huang Y.-X.
    Yu Z.-T.
    Guo J.-J.
    Yu Z.-Q.
    Gao F.-Y.
    Ruan Jian Xue Bao/Journal of Software, 2023, 34 (04) : 1796 - 1810