Multi-document summarization via submodularity

被引:29
|
作者
Li, Jingxuan [1 ]
Li, Lei [1 ]
Li, Tao [1 ]
机构
[1] Florida Int Univ, Sch Comp & Informat Sci, Miami, FL 33199 USA
基金
美国国家科学基金会;
关键词
Multi-document summarization; Submodularity; Greedy algorithm;
D O I
10.1007/s10489-012-0336-1
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Multi-document summarization is becoming an important issue in the Information Retrieval community. It aims to distill the most important information from a set of documents to generate a compressed summary. Given a set of documents as input, most of existing multi-document summarization approaches utilize different sentence selection techniques to extract a set of sentences from the document set as the summary. The submodularity hidden in the term coverage and the textual-unit similarity motivates us to incorporate this property into our solution to multi-document summarization tasks. In this paper, we propose a new principled and versatile framework for different multi-document summarization tasks using submodular functions (Nemhauser et al. in Math. Prog. 14(1):265-294, 1978) based on the term coverage and the textual-unit similarity which can be efficiently optimized through the improved greedy algorithm. We show that four known summarization tasks, including generic, query-focused, update, and comparative summarization, can be modeled as different variations derived from the proposed framework. Experiments on benchmark summarization data sets (e.g., DUC04-06, TAC08, TDT2 corpora) are conducted to demonstrate the efficacy and effectiveness of our proposed framework for the general multi-document summarization tasks.
引用
收藏
页码:420 / 430
页数:11
相关论文
共 50 条
  • [31] Enhancing multi-document summarization using concepts
    Rao, Pattabhi R. K.
    Devi, S. Lalitha
    SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES, 2018, 43 (02):
  • [32] Mixture of Topic Model for Multi-document Summarization
    Liu Na
    Li Ming-xia
    Lu Ying
    Tang Xiao-jun
    Wang Hai-wen
    Xiao Peng
    26TH CHINESE CONTROL AND DECISION CONFERENCE (2014 CCDC), 2014, : 5168 - 5172
  • [33] Disentangling Specificity for Abstractive Multi-document Summarization
    Ma, Congbo (congbo.ma@mq.edu.au), 1600, Institute of Electrical and Electronics Engineers Inc.
  • [34] Genetic algorithm based multi-document summarization
    Liu, Dexi
    He, Yanxiang
    Ji, Donghong
    Yang, Hua
    PRICAI 2006: TRENDS IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, 4099 : 1140 - 1144
  • [35] A Game Theory Approach for Multi-document Summarization
    Ahmad, Amreen
    Ahmad, Tanvir
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2019, 44 (04) : 3655 - 3667
  • [36] Multi-document summarization based on unsupervised clustering
    Ji, Paul
    INFORMATION RETRIEVAL TECHNOLOLGY, PROCEEDINGS, 2006, 4182 : 560 - 566
  • [37] Geodesic Distance based Multi-document Summarization
    Ma, Huifang
    He, Qing
    Shi, Zhongzhi
    IEEE NLP-KE 2008: PROCEEDINGS OF INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, 2008, : 54 - 59
  • [38] A Hybrid Topic Model for Multi-Document Summarization
    Xu, JinAn
    Liu, JiangMing
    Araki, Kenji
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2015, E98D (05): : 1089 - 1094
  • [39] MRS for multi-document summarization by sentence extraction
    Yong-Dong Xu
    Xiao-Dong Zhang
    Guang-Ri Quan
    Ya-Dong Wang
    Telecommunication Systems, 2013, 53 : 91 - 98
  • [40] Personalized Multi-Document Summarization in information retrieval
    Yang, Xiao-Peng
    Liu, Xiao-Rong
    PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2008, : 4108 - +