Event Graph-Based News Clustering: The Role of Named Entity-Centered Subgraphs

Authors
Komecoglu, Basak Buluz [1]
Yilmaz, Burcu [1]
Affiliations
[1] Gebze Tech Univ, Inst Informat Technol, TR-41400 Gebze, Kocaeli, Turkiye
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Task analysis; Clustering algorithms; Vectors; Context modeling; Computational modeling; Analytical models; Semantics; Natural language processing; Text processing; Frequent subgraph mining; low-resource language; text clustering; TOPIC DETECTION; SIMILARITY
DOI
10.1109/ACCESS.2024.3435343
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In an era of exponential growth in online news sources, intelligent digital solutions that can efficiently analyze and organize large volumes of news content have become crucial. This paper presents a graph-based methodology designed to enhance Topic Detection and Tracking (TDT) tasks in natural language processing by efficiently clustering news events into coherent stories. The proposed approach leverages a novel event graph model that captures not only the characteristics of individual news events but also their collective narrative context. Using Named Entity-Centered Frequent Subgraphs, the model identifies recurring patterns of events and thus provides a framework for learning a robust, language-independent, structured representation of news stories, which represents a significant advance in the refinement of traditional clustering algorithms. Empirical experiments on a multilingual benchmark dataset, the News Clustering Dataset, show that our approach achieves superior clustering performance compared to state-of-the-art monolingual document clustering techniques, particularly in English, and competitive results in Spanish. To demonstrate the adaptability of the methodology to low-resource languages, a Turkish 'Story-Based News Dataset' was developed specifically for this study; it also promises to serve as an important resource for a wide range of natural language processing tasks.
Pages: 105613-105632
Number of pages: 20