Topic representation model based on microblogging behavior analysis

Cited by: 10
Authors
Han, Weihong [1]
Tian, Zhihong [1]
Huang, Zizhong [2]
Li, Shudong [1]
Jia, Yan [3]
Affiliations
[1] Guangzhou Univ, Cyberspace Inst Adv Technol, Guangzhou 510006, Peoples R China
[2] Natl Univ Def Technol, Comp Sch, Changsha 410073, Peoples R China
[3] Cyberspace Secur Res Ctr, Peng Cheng Lab, Shenzhen 518000, Peoples R China
Keywords
Topic representation model; Behavior analysis; Word distribution; LDA model; Topic detection; INTERNET;
DOI
10.1007/s11280-020-00822-x
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Microblogging has become an important way for people to obtain information, express opinions, and make suggestions. Identifying new topics quickly and accurately from massive microblogging data plays a crucial role in recommending information and controlling public opinion. The topic representation model provides a basis for topic detection. In this paper, we propose a topic representation model based on user behavior analysis, the microblogging behavior analysis-latent Dirichlet allocation (MBA-LDA) model, for microblogging datasets. The topic-word distribution is obtained from an LDA model that takes into account user behaviors (such as posting, forwarding, and commenting) and the distribution of words among documents within one topic and among different topics. The model also re-assesses the importance of words in topic representation. The basic idea is that how a word is distributed within a topic and across topics strongly influences whether it should be selected to represent a topic. If a word is evenly distributed among all documents of a topic, it is common to all documents in that topic and is therefore more suitable to represent it. If a word is evenly distributed across topics, it is common to all topics and cannot distinguish between them, so it is less suitable to represent any topic. In experiments on a real Sina Microblogging dataset, the topic model based on the MBA-LDA algorithm gives representative words greater importance and increases the differentiation of topic words, which effectively improves the accuracy of subsequent topic detection and evolutionary analysis.
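The re-assessment idea in the abstract can be illustrated with a small sketch. The abstract does not give the MBA-LDA formulas, so everything below is an assumption made for illustration: normalized Shannon entropy stands in as the measure of "evenness", and the hypothetical helper `reweight` simply multiplies a base topic-word weight by the within-topic evenness and by one minus the across-topic evenness. It is not the authors' method.

```python
import math


def normalized_entropy(counts):
    """Evenness of a count vector in [0, 1].

    1.0 means the counts are spread uniformly; 0.0 means they are
    concentrated in a single bin (or there is nothing to compare).
    """
    total = sum(counts)
    n = len(counts)
    if total == 0 or n < 2:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h / math.log(n)


def reweight(base_weight, doc_counts_in_topic, counts_across_topics):
    """Hypothetical re-weighting, not the formula from the paper.

    base_weight: the word's weight in the topic (e.g. from plain LDA).
    doc_counts_in_topic: occurrences of the word in each document of the topic.
    counts_across_topics: occurrences of the word in each topic.
    """
    within = normalized_entropy(doc_counts_in_topic)   # high is good
    across = normalized_entropy(counts_across_topics)  # high is bad
    return base_weight * within * (1.0 - across)


# Two words with the same base weight and the same even spread inside
# the topic; only their spread across three topics differs.
specific_word = reweight(0.05, [3, 3, 3, 3], [12, 0, 0])
generic_word = reweight(0.05, [3, 3, 3, 3], [12, 12, 12])
print(specific_word > generic_word)
```

Under these assumptions, a word that appears in every document of one topic but in no other topic keeps its base weight, while a word that appears equally in all topics is pushed toward zero, which matches the qualitative behavior the abstract describes.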
Pages: 3083-3097
Page count: 15
Related papers
50 in total
  • [21] Web Topic Representation Based on Multi-layer Semantic Model
    Shi, Peng
    Hu, Changjun
    Zhao, Ruopeng
    Ding, Lianhong
    ISISE 2008: INTERNATIONAL SYMPOSIUM ON INFORMATION SCIENCE AND ENGINEERING, VOL 2, 2008, : 244 - +
  • [22] Topic-oriented measurement of microblogging network
    Liu, Wei
    Wang, Li-Hong
    Li, Rui-Guang
    Tongxin Xuebao/Journal on Communications, 2013, 34 (11): : 171 - 178
  • [23] Topic-Based Place Semantics Discovered from Microblogging Text Messages
    Kim, Eunyoung
    Ihm, Hwon
    Myaeng, Sung-Hyon
    WWW'14 COMPANION: PROCEEDINGS OF THE 23RD INTERNATIONAL CONFERENCE ON WORLD WIDE WEB, 2014, : 561 - 562
  • [24] Topic-level opinion influence model (TOIM): An investigation using tencent microblogging
    Li, Daifeng
    Tang, Jie
    Ding, Ying
    Shuai, Xin
    Chambers, Tamy
    Sun, Guozheng
    Luo, Zhipeng
    Zhang, Jingwei
    JOURNAL OF THE ASSOCIATION FOR INFORMATION SCIENCE AND TECHNOLOGY, 2015, 66 (12) : 2657 - 2673
  • [25] Topic Oriented Semantic Parsing A topic based question representation
    Sharma, Lokesh Kumar
    Mittal, Namita
    2015 IEEE 9TH INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC), 2015, : 159 - 164
  • [26] Dynamic scene analysis based on the topic model
    Fan, Yawen
    Zheng, Shibao
    2013 2ND INTERNATIONAL SYMPOSIUM ON INSTRUMENTATION AND MEASUREMENT, SENSOR NETWORK AND AUTOMATION (IMSNA), 2013, : 436 - 439
  • [27] Rumor Identification in Microblogging Systems Based on Users' Behavior
    Liang, Gang
    He, Wenbo
    Xu, Chun
    Chen, Liangyin
    Zeng, Jinquan
    IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2015, 2 (03) : 99 - 108
  • [28] A Probabilistic model for compact document topic representation
    Berenyi, Zsolt
    Vajk, Istvan
    PROCEEDINGS OF THE 9TH WSEAS INTERNATIONAL CONFERENCE ON SIMULATION, MODELLING AND OPTIMIZATION, 2009, : 322 - +
  • [29] Query Classification using LDA Topic Model and Sparse Representation Based Classifier
    Bhattacharya, Indrani
    Sil, Jaya
    PROCEEDINGS OF THE THIRD ACM IKDD CONFERENCE ON DATA SCIENCES (CODS), 2016,
  • [30] On Topic Aware Recommendation to Increase Popularity in Microblogging Services
    Litou, Iouliana
    Kalogeraki, Vana
    Gunopulos, Dimitrios
    ON THE MOVE TO MEANINGFUL INTERNET SYSTEMS: OTM 2016 CONFERENCES, 2016, 10033 : 673 - 681