Identification of Event and Topic for Multi-document Summarization

Cited by: 1
Authors
Fukumoto, Fumiyo [1]
Suzuki, Yoshimi [1]
Takasu, Atsuhiro [2]
Matsuyoshi, Suguru [3]
Affiliations
[1] Univ Yamanashi, Grad Fac Interdisciplinary Res, Kofu, Yamanashi 4008510, Japan
[2] Natl Inst Informat, Tokyo, Japan
[3] Univ Yamanashi, Interdisciplinary Grad Sch Med & Engn, Kofu, Yamanashi, Japan
Keywords
Latent Dirichlet Allocation; Moving Average Convergence/Divergence; Multi-document summarization
DOI
10.1007/978-3-319-43808-5_23
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper focuses on continuous news documents and presents a method for extractive multi-document summarization. Our hypothesis is that salient, key sentences in news documents include words related to the target event and topic of a document. Here, event and topic are defined as in the Topic Detection and Tracking (TDT) project: an event is something that occurs at a specific place and time along with all necessary preconditions and unavoidable consequences, and a topic is "a seminal event or activity along with all directly related events and activities." The difficulty in finding topics is that they have various word distributions. In addition to using the TF-IDF term weighting method to extract event words, we identified topics by using two models: Moving Average Convergence Divergence (MACD) for high-frequency words, and Latent Dirichlet Allocation (LDA) for low-frequency words. The method was tested on two datasets, NTCIR-3 Japanese news documents and DUC data, and the results showed the effectiveness of the method.
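The abstract names MACD as the model for high-frequency topic words but gives no formulation. As a rough illustration only, the sketch below applies the standard MACD calculation (difference of a short and a long exponential moving average, compared against a signal line) to the daily counts of a single word. The function names, the spans (4, 8, 5), the threshold, and the sample counts are all assumptions made for this example and do not come from the paper.

```python
def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = [float(values[0])]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd(values, short_span=4, long_span=8, signal_span=5):
    """Return (macd_line, signal_line, histogram) for a count series.

    The spans are illustrative assumptions, not values from the paper.
    """
    short = ema(values, short_span)
    long_ = ema(values, long_span)
    macd_line = [s - l for s, l in zip(short, long_)]
    signal_line = ema(macd_line, signal_span)
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram


def first_burst(values, threshold=0.5):
    """Index of the first day whose histogram exceeds threshold, else None.

    The threshold is a hypothetical cut-off chosen for this example.
    """
    _, _, histogram = macd(values)
    for day, h in enumerate(histogram):
        if h > threshold:
            return day
    return None


# Hypothetical daily counts of one word across a news stream.
bursty_counts = [1, 1, 1, 1, 1, 1, 9, 12, 10, 2, 1, 1]
flat_counts = [3] * 12

print("burst starts on day:", first_burst(bursty_counts))
print("flat series burst:", first_burst(flat_counts))
```

A word whose count series yields a clearly positive histogram is rising faster than its longer-term trend, which is one plausible way to flag a frequent word as tied to an ongoing topic. How the authors actually threshold or combine this signal with TF-IDF and LDA is not stated in the abstract.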
Pages: 304-316
Page count: 13