Examining the Coherence of the Top Ranked Tweet Topics

Cited by: 8
Authors
Fang, Anjie [1]
Macdonald, Craig [1]
Ounis, Iadh [1]
Habel, Philip [1]
Affiliations
[1] Univ Glasgow, Glasgow, Lanark, Scotland
DOI
10.1145/2911451.2914731
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Topic modelling approaches help scholars to examine the topics discussed in a corpus. Due to the popularity of Twitter, two distinct methods have been proposed to accommodate the brevity of tweets: the tweet pooling method and Twitter LDA. Both of these methods demonstrate a higher performance in producing more interpretable topics than the standard Latent Dirichlet Allocation (LDA) when applied on tweets. However, while various metrics have been proposed to estimate the coherence of the generated topics from tweets, the coherence of the top ranked topics, those that are most likely to be examined by users, has not been investigated. In addition, the effect of the number of generated topics K on the topic coherence scores has not been studied. In this paper, we conduct large-scale experiments using three topic modelling approaches over two Twitter datasets, and apply a state-of-the-art coherence metric to study the coherence of the top ranked topics and how K affects such coherence. Inspired by ranking metrics such as precision at n, we use coherence at n to assess the coherence of a topic model. To verify our results, we conduct a pairwise user study to obtain human preferences over topics. Our findings are threefold: we find evidence that Twitter LDA outperforms both LDA and the tweet pooling method because the top ranked topics it generates have more coherence; we demonstrate that a larger number of topics (K) helps to generate topics with more coherence; and finally, we show that coherence at n is more effective when evaluating the coherence of a topic model than the average coherence score.
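The abstract does not define coherence at n, so the following is only a minimal sketch under an assumed definition: by analogy with precision at n, it takes the mean coherence of the n highest-scoring topics of a model. The function name `coherence_at_n` and all scores are hypothetical, chosen to show how the average and coherence at n can rank two models differently.

```python
from statistics import mean


def coherence_at_n(scores, n):
    """Mean coherence of the n most coherent topics.

    Assumed definition (not taken from the paper): rank topics by their
    coherence score, keep the top n, and average them.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    top = sorted(scores, reverse=True)[:n]
    return mean(top)


# Hypothetical per-topic coherence scores for two topic models.
model_a = [0.9, 0.8, 0.1, 0.1, 0.1]  # a few very coherent topics
model_b = [0.5, 0.4, 0.4, 0.4, 0.4]  # uniformly mediocre topics

avg_a, avg_b = mean(model_a), mean(model_b)
top_a, top_b = coherence_at_n(model_a, 2), coherence_at_n(model_b, 2)

print(f"average:        A={avg_a:.2f}  B={avg_b:.2f}")
print(f"coherence at 2: A={top_a:.2f}  B={top_b:.2f}")
```

With these made-up scores, the average slightly favours model B, while coherence at 2 favours model A, whose top ranked topics are the ones a user is most likely to examine.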
Pages: 825-828
Page count: 4