Modeling Chinese Microblogs with Five Ws for Topic Hashtags Extraction

Cited by: 0
Authors
Zhibin Zhao [1]
Jiahong Sun [1]
Lan Yao [1]
Xun Wang [1]
Jiahong Chu [1]
Huan Liu [1]
Ge Yu [1]
Affiliation
[1] College of Computer Science and Engineering, Northeastern University
Funding
National Natural Science Foundation of China
Keywords
hashtag; microblog; topic detection; short-message-style news; five Ws
DOI
Not available
Chinese Library Classification (CLC) number
TP393.092; TP391.1 [text information processing]
Discipline classification code
080402; 081203; 0835
Abstract
Hashtags are important metadata in microblogs and are used to mark topics or index messages. However, statistics show that hashtags are absent from most microblogs, which poses great challenges for the retrieval and analysis of these tagless microblogs. In this paper, we summarize the similarity between microblogs and short-message-style news, and then propose an algorithm, named 5WTAG, for detecting microblog topics based on a model of five Ws (When, Where, Who, What, and hoW). Because the five-W attributes are the core components of an event description, it is guaranteed theoretically that 5WTAG can properly extract semantic topics from microblogs. We describe the detailed procedure of the algorithm, including spam microblog identification, microblog segmentation, and candidate hashtag construction. In addition, we propose a novel recommendation computing method for ranking candidate hashtags, which combines syntactic and semantic analysis and takes into account the observed distribution of artificial topic hashtags. Finally, we conduct comprehensive experiments on real data from Sina Weibo to verify the semantic correctness and completeness of the candidate hashtags, as well as the accuracy of the recommendation method.
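The candidate hashtag construction and ranking steps named in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the 5WTAG implementation: the `FiveW` container, the `SLOT_WEIGHTS` values, and the `candidate_hashtags` function are hypothetical, the five-W slots are assumed to have been extracted already, and the averaged-weight score merely stands in for the paper's recommendation computing method, which the abstract does not specify. The `#topic#` form follows the hashtag convention used on Sina Weibo.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Optional


@dataclass
class FiveW:
    """Hypothetical container for the five-W slots of one microblog."""
    when: Optional[str] = None
    where: Optional[str] = None
    who: Optional[str] = None
    what: Optional[str] = None
    how: Optional[str] = None


# Hypothetical weights (not from the paper): contribution of each slot to a score.
SLOT_WEIGHTS = {"when": 0.5, "where": 1.0, "who": 1.5, "what": 2.0, "how": 0.5}
SLOT_ORDER = ["when", "where", "who", "what", "how"]


def candidate_hashtags(record: FiveW, max_slots: int = 3) -> list[tuple[str, float]]:
    """Combine the filled slots into candidate hashtags, highest score first."""
    filled = [(name, getattr(record, name)) for name in SLOT_ORDER
              if getattr(record, name)]
    candidates = []
    for size in range(1, min(max_slots, len(filled)) + 1):
        for combo in combinations(filled, size):
            # Weibo-style topic hashtag: text enclosed in a pair of '#'.
            text = "#" + " ".join(value for _, value in combo) + "#"
            # Average slot weight, so longer candidates are not favoured automatically.
            score = sum(SLOT_WEIGHTS[name] for name, _ in combo) / size
            candidates.append((text, score))
    # Stable sort by descending score; ties keep their construction order.
    return sorted(candidates, key=lambda pair: -pair[1])


if __name__ == "__main__":
    record = FiveW(where="Shenyang", who="Northeastern University",
                   what="opens new library")
    for tag, score in candidate_hashtags(record):
        print(f"{score:.2f}  {tag}")
```

In this sketch a microblog with no filled slots yields no candidates, and raising `max_slots` allows longer, more descriptive hashtags at the cost of a larger candidate set to rank.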
Pages: 135-148
Page count: 14
Related papers
  • [1] Modeling Chinese microblogs with five Ws for topic hashtags extraction
    Zhao Z.
    Sun J.
    Yao L.
    Wang X.
    Chu J.
    Liu H.
    Yu G.
    Yao, Lan (yaolan@cse.neu.edu.cn), 1600, Tsinghua University (22): : 135 - 148
  • [2] Modeling Chinese Microblogs with Five Ws for Topic Hashtags Extraction
    Zhao, Zhibin
    Sun, Jiahong
    Yao, Lan
    Wang, Xun
    Chu, Jiahong
    Liu, Huan
    Yu, Ge
    TSINGHUA SCIENCE AND TECHNOLOGY, 2017, 22 (02) : 135 - 148
  • [3] Modeling Chinese Microblogs with Five Ws for Topic Hashtags Extraction
    Zhibin Zhao
    Jiahong Sun
    Lan Yao
    Xun Wang
    Jiahong Chu
    Huan Liu
    Ge Yu
    Tsinghua Science and Technology, 2017, 22 (02) : 135 - 148
  • [4] Determining the Topic Hashtags for Chinese Microblogs Based on 5W Model
    Zhao, Zhibin
    Sun, Jiahong
    Mao, Zhenyu
    Feng, Shi
    Bao, Yubin
    BIG DATA COMPUTING AND COMMUNICATIONS, (BIGCOM 2016), 2016, 9784 : 55 - 67
  • [5] Improving Interpretations of Topic Modeling in Microblogs
    Alkhodair, Sarah A.
    Fung, Benjamin C. M.
    Rahman, Osmud
    Hung, Patrick C. K.
    JOURNAL OF THE ASSOCIATION FOR INFORMATION SCIENCE AND TECHNOLOGY, 2018, 69 (04) : 528 - 540
  • [6] Making Recommendations on Microblogs through Topic Modeling
    Chen, Chaochao
    Zheng, Xiaolin
    Zhou, Chaofei
    Chen, Deren
    WEB INFORMATION SYSTEMS ENGINEERING - WISE 2013 WORKSHOPS, 2014, 8182 : 252 - 265
  • [7] Topic thread extraction from search results of microblogs
    School of Computer Science and Technology, Wuhan University of Technology, Wuhan, 430070, Hubei, China
    Unknown
    Li, L., 1600, Asian Network for Scientific Information (12):
  • [8] Combining IR and LDA Topic Modeling for Filtering Microblogs
    Hajjem, Malek
    Latiri, Chiraz
    KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS, 2017, 112 : 761 - 770
  • [9] CMiner: Opinion Extraction and Summarization for Chinese Microblogs
    Zhou, Xinjie
    Wan, Xiaojun
    Xiao, Jianguo
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2016, 28 (07) : 1650 - 1663
  • [10] Detection and Extraction of Hot Topics on Chinese Microblogs
    Yang, Liang
    Lin, Hongfei
    Lin, Yuan
    Liu, Shengbo
    COGNITIVE COMPUTATION, 2016, 8 (04) : 577 - 586