Event Graph-Based News Clustering: The Role of Named Entity-Centered Subgraphs

Authors
Komecoglu, Basak Buluz [1]
Yilmaz, Burcu [1]
Affiliations
[1] Gebze Tech Univ, Inst Informat Technol, TR-41400 Gebze, Kocaeli, Turkiye
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Task analysis; Clustering algorithms; Vectors; Context modeling; Computational modeling; Analytical models; Semantics; Natural language processing; Text processing; Frequent subgraph mining; low-resource language; natural language processing; text clustering; TOPIC DETECTION; SIMILARITY;
DOI
10.1109/ACCESS.2024.3435343
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In an era of exponential growth in online news sources, the need for intelligent digital solutions capable of efficiently analyzing and organizing large amounts of news content has become crucial. This paper presents a graph-based methodology designed to enhance Topic Detection and Tracking (TDT) tasks in natural language processing by efficiently clustering news events into coherent stories. The proposed approach leverages a novel event graph model that captures not only the characteristics of individual news events but also their collective narrative context. Using Named Entity-Centered Frequent Subgraphs, the model identifies recurring patterns of events and thus provides a framework for learning a robust, language-independent, and structured representation of news stories, which represents a significant advance in the refinement of traditional clustering algorithms. Empirical experiments using a multilingual benchmark dataset, the News Clustering Dataset, highlight the superior clustering performance of our approach compared to state-of-the-art monolingual document clustering techniques, particularly in English, with competitive results in Spanish. The Turkish 'Story-Based News Dataset', developed specifically for this study to underline the adaptability of the methodology to low-resource languages, also promises to serve as an important resource for a wide range of natural language processing tasks.
Pages: 105613-105632
Page count: 20