Microblog Bursty Events Detection Method Based on Multiple Word Features

Cited by: 0
Authors
Zhang Y.-S. [1, 3]
Duan Y.-X. [1]
Wang J. [1]
Wu Y.-F. [2]
Affiliations
[1] Institute of Intelligent Information Processing, Beijing Information Science and Technology University, Beijing
[2] Institute of Computational Linguistics, Peking University, Beijing
[3] Beijing Laboratory of National Economic Security Early-warning Engineering, Beijing
Source
Keywords
Bursty events; Bursty feature words; D-S evidence theory; Hierarchical agglomerative clustering; Microblog
DOI
10.3969/j.issn.0372-2112.2019.09.015
CLC number
Subject classification number
Abstract
In recent years, bursty events have occurred frequently in many fields, affecting both the stability and the development of society. This paper proposes an event detection model based on multiple word features to detect bursty events in massive microblog data. The model is intended to help decision-makers monitor microblogs and guide public opinion, and to reduce the negative effects of bursty events on society. First, the model slices the microblog data into time windows according to the time information of the posts. In each time window, the word frequency feature, the topic tag feature, and the word frequency growth rate feature of each word are calculated separately. Then, D-S evidence theory and the analytic hierarchy process are used to determine the weights of each word's features, and the weighted features are merged to obtain the bursty feature value of the word. Words with large bursty feature values are selected to form the bursty feature word set, and a coupling degree matrix of this set is constructed from co-occurrence degree and tightness. Finally, the coupling degree matrix is used as the input of a hierarchical agglomerative clustering algorithm, which generates a binary tree with the bursty words as leaf nodes, and a binary tree pruning algorithm based on internal similarity divides the clustering result. In this way, the bursty events of the corresponding time window are detected. The experimental results show that the event detection model based on bursty words performs best when the intra-cluster similarity threshold is 1.1: the precision (correct rate) is 0.8462, the recall is 0.8684, and the F value is 0.8571, indicating the effectiveness of the proposed method. © 2019, Chinese Institute of Electronics. All rights reserved.
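The per-window feature step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the input layout, the feature formulas, and the fixed weights in `WEIGHTS` are all assumptions made for illustration. In particular, the paper derives its weights with D-S evidence theory and the analytic hierarchy process, which this sketch replaces with hand-picked constants.

```python
from collections import Counter

# Hypothetical weights chosen by hand. The paper derives its weights with
# D-S evidence theory and the analytic hierarchy process instead.
WEIGHTS = {"freq": 0.4, "tag": 0.3, "growth": 0.3}


def window_features(posts, prev_counts):
    """Compute three per-word features for one time window.

    posts: list of (tokens, tag_words) pairs, where tag_words are the
           words that appear inside topic tags of that post.
    prev_counts: Counter of word counts from the previous window.
    Returns (features, counts) so counts can feed the next window.
    """
    counts = Counter()
    tag_counts = Counter()
    for tokens, tag_words in posts:
        counts.update(tokens)
        tag_counts.update(tag_words)

    total = sum(counts.values()) or 1
    tag_total = sum(tag_counts.values()) or 1
    features = {}
    for word, n in counts.items():
        prev = prev_counts.get(word, 0)
        features[word] = {
            "freq": n / total,                    # share of all tokens
            "tag": tag_counts[word] / tag_total,  # share of tag words
            "growth": (n - prev) / (prev + 1),    # smoothed growth rate
        }
    return features, counts


def burst_scores(features, weights=WEIGHTS):
    """Scale each feature by its window maximum, then take a weighted sum."""
    scores = {word: 0.0 for word in features}
    for name, weight in weights.items():
        # Negative growth is clamped to zero before scaling.
        peak = max((max(f[name], 0.0) for f in features.values()), default=0.0)
        for word, f in features.items():
            value = max(f[name], 0.0) / peak if peak > 0 else 0.0
            scores[word] += weight * value
    return scores


# Toy data: "quake" is new, frequent, and tagged in the current window.
prev_counts = Counter({"rain": 2, "city": 2})
posts = [
    (["quake", "city", "quake"], ["quake"]),
    (["quake", "rain"], []),
    (["city", "news"], []),
]
features, counts = window_features(posts, prev_counts)
scores = burst_scores(features)
top_word = max(scores, key=scores.get)
```

On this toy data `top_word` is `"quake"`, since it leads on all three features. Words with the highest scores would form the bursty feature word set that the abstract passes on to the coupling degree matrix and the clustering step, neither of which is sketched here.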
Pages: 1919-1928
Page count: 9
Related papers
17 in total
  • [1] Goto J., Miyazaki T., Takei Y., et al., Automatic tweet detection based on data specified through news production, Proceedings of the 23rd International Conference on Intelligent User Interfaces Companion, (2018)
  • [2] Zhou G., Zou H., Xiong X., et al., MB-SinglePass: Microblog topic detection based on combined similarity, Computer Science, 39, 10, pp. 198-202, (2012)
  • [3] Qiu Y.F., Cheng L.B., Research on sudden topic detection method for microblog, Computer Engineering, 38, 9, pp. 288-290, (2012)
  • [4] Du Y., Wu W., He Y., et al., Microblog bursty feature detection based on dynamics model, International Conference on Systems and Informatics, pp. 2304-2308, (2012)
  • [5] Salas A., Georgakis P., Petalas Y., Incident detection using data from social media, Proceedings of the 20th International Conference on Intelligent Transportation Systems, pp. 751-755, (2017)
  • [6] Schmidt A., Wiegand M., A survey on hate speech detection using natural language processing, Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media, pp. 1-10, (2017)
  • [7] Kalden J.P.H., Data analysis within the Netherlands Coastguard: risk mapping, social network analysis and anomaly detection, NL ARMS Netherlands Annual Review of Military Studies 2018, pp. 193-200, (2018)
  • [8] Guo Y., Lyu X., Li Z., Bursty topics detection approach on Chinese microblog based on burst words clustering, Journal of Computer Applications, 34, 2, (2014)
  • [9] Unankard S., Li X., Sharaf M.A., Emerging event detection in social networks with location sensitivity, World Wide Web, 18, 5, pp. 1393-1417, (2015)
  • [10] Quezada M., Peña-Araya V., Poblete B., Location-aware model for news events in social media, Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 935-938, (2015)