Event Graph-Based News Clustering: The Role of Named Entity-Centered Subgraphs

Cited by: 0
Authors
Komecoglu, Basak Buluz [1]
Yilmaz, Burcu [1]
Affiliations
[1] Gebze Tech Univ, Inst Informat Technol, TR-41400 Gebze, Kocaeli, Turkiye
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Task analysis; Clustering algorithms; Vectors; Context modeling; Computational modeling; Analytical models; Semantics; Natural language processing; Text processing; Frequent subgraph mining; low-resource language; natural language processing; text clustering; TOPIC DETECTION; SIMILARITY
DOI
10.1109/ACCESS.2024.3435343
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In an era of exponential growth in online news sources, the need for intelligent digital solutions capable of efficiently analyzing and organizing large amounts of news content has become crucial. This paper presents a graph-based methodology designed to enhance Topic Detection and Tracking (TDT) tasks in natural language processing by efficiently clustering news events into coherent stories. The proposed approach leverages a novel event graph model that captures not only the characteristics of individual news events but also their collective narrative context. Using Named Entity Centred Frequent Subgraphs, the model excels in identifying recurring patterns of events and thus provides a framework for learning a robust, language-independent, and structured representation for structuring news stories, which represents a significant advance in the refinement of traditional clustering algorithms. Empirical experiments using a multilingual benchmark dataset, the News Clustering Dataset, highlight the superior clustering performance of our approach compared to state-of-the-art monolingual document clustering techniques, particularly in English and the competitive results in Spanish. To underline the adaptability of the methodology to low-resource languages, the Turkish 'Story-Based News Dataset' developed specifically for this study also promises to serve as an important resource for a wide range of natural language processing tasks.
Pages: 105613 - 105632
Page count: 20
Related papers
50 in total
  • [1] Predicate Clustering-Based Entity-Centered Graph Pattern Recognition for Query Extension on the LOD
    Kim, Jongmo
    Kong, Junsik
    Park, Daeun
    Sohn, Mye
    INNOVATIVE MOBILE AND INTERNET SERVICES IN UBIQUITOUS COMPUTING, IMIS-2018, 2019, 773 : 159 - 170
  • [2] Graph-Based Named Entity Linking with Wikipedia
    Hachey, Ben
    Radford, Will
    Curran, James R.
    WEB INFORMATION SYSTEMS ENGINEERING - WISE 2011, 2011, 6997 : 213 - +
  • [3] Entity Co-occurrence Graph-Based Clustering for Twitter Event Detection
    Manaskasemsak, Bundit
    Netsiwawichian, Natthakit
    Rungsawang, Arnon
    ADVANCED INFORMATION NETWORKING AND APPLICATIONS, VOL 2, AINA 2024, 2024, 200 : 344 - 355
  • [4] Enhancement of Medical Named Entity Recognition Using Graph-based Features
    Keretna, Sara
    Lim, Chee Peng
    Creighton, Doug
    2015 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC 2015): BIG DATA ANALYTICS FOR HUMAN-CENTRIC SYSTEMS, 2015, : 1895 - 1900
  • [5] Graph-Based Clustering Approach for Economic and Financial Event Detection Using News Analytics Data
    Sidorov, Sergei P.
    Faizliev, Alexey R.
    Levshunov, Michael
    Chekmareva, Alfia
    Gudkov, Alexander
    Korobov, Eugene
    SOCIAL INFORMATICS (SOCINFO 2018), PT II, 2018, 11186 : 271 - 280
  • [6] NESM: a Named Entity based Proximity Measure for Multilingual News Clustering
    Montalvo, Soto
    Fresno, Victor
    Martinez, Raquel
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2012, (48): : 81 - 88
  • [7] Named entity linking in microblog posts using graph-based centrality scoring
    Kalloubi, Fahd
    Nfaoui, El Habib
    El Beqqali, Omar
    2014 9TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS: THEORIES AND APPLICATIONS (SITA'14), 2014,
  • [8] A Token-wise Graph-based Framework for Multimodal Named Entity Recognition
    Zhang, Zhengxuan
    Mai, Weixing
    Xiong, Haoliang
    Wu, Chuhan
    Xue, Yun
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 2153 - 2158
  • [9] WebKey: a graph-based method for event detection in web news
    Rasouli, Elham
    Zarifzadeh, Sajjad
    Rafsanjani, Amir Jahangard
    JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2020, 54 (03) : 585 - 604