Multi-document summarization via submodularity

Cited by: 29
Authors
Li, Jingxuan [1]
Li, Lei [1]
Li, Tao [1]
Affiliations
[1] Florida Int Univ, Sch Comp & Informat Sci, Miami, FL 33199 USA
Funding
National Science Foundation (United States)
Keywords
Multi-document summarization; Submodularity; Greedy algorithm;
DOI
10.1007/s10489-012-0336-1
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-document summarization is becoming an important issue in the Information Retrieval community. It aims to distill the most important information from a set of documents into a compressed summary. Given a set of documents as input, most existing multi-document summarization approaches use sentence selection techniques to extract a set of sentences from the document set as the summary. The submodularity hidden in term coverage and textual-unit similarity motivates us to incorporate this property into our solution to multi-document summarization tasks. In this paper, we propose a new principled and versatile framework for different multi-document summarization tasks using submodular functions (Nemhauser et al. in Math. Prog. 14(1):265-294, 1978) based on term coverage and textual-unit similarity, which can be efficiently optimized with an improved greedy algorithm. We show that four known summarization tasks (generic, query-focused, update, and comparative summarization) can be modeled as variations of the proposed framework. Experiments on benchmark summarization data sets (e.g., the DUC04-06, TAC08, and TDT2 corpora) demonstrate the effectiveness of the proposed framework for general multi-document summarization tasks.
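To make the idea of greedily optimizing a term-coverage objective concrete, the following is a minimal sketch and not the paper's method. It assumes a plain greedy selection under a simple sentence-count budget, uses only term coverage (textual-unit similarity is ignored), and does not reproduce the improved greedy algorithm mentioned in the abstract. The function names `tokenize` and `greedy_summary` and the toy sentences are hypothetical.

```python
def tokenize(sentence):
    """Split a sentence into a set of lowercase terms (naive whitespace split)."""
    return set(sentence.lower().split())


def greedy_summary(term_sets, budget):
    """Pick up to `budget` sentence indices by plain greedy term coverage.

    The objective (number of distinct terms covered) is monotone and
    submodular: each added sentence can only contribute terms that are
    not covered yet, so marginal gains shrink as the summary grows.
    This is an illustrative assumption, not the paper's exact objective.
    """
    selected = []
    covered = set()
    while len(selected) < budget:
        best, best_gain = None, 0
        for i, terms in enumerate(term_sets):
            if i in selected:
                continue
            gain = len(terms - covered)  # marginal gain of sentence i
            if gain > best_gain:         # strict '>' keeps the earliest tie
                best, best_gain = i, gain
        if best is None:                 # no sentence adds a new term
            break
        selected.append(best)
        covered |= term_sets[best]
    return selected


# Hypothetical toy input.
sentences = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
    "the cat sat",
]
term_sets = [tokenize(s) for s in sentences]
summary = greedy_summary(term_sets, budget=2)
print([sentences[i] for i in summary])
```

In this sketch the second pick is the sentence with the most terms not yet covered, rather than the second-longest sentence, which is the diminishing-returns behavior that submodularity captures.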
Pages: 420-430
Number of pages: 11