CoMSum and SIBERT: A Dataset and Neural Model for Query-Based Multi-document Summarization

Cited by: 6
Authors
Kulkarni, Sayali [1]
Chammas, Sheide [1]
Zhu, Wan [1]
Sha, Fei [1]
Ie, Eugene [1]
Affiliations
[1] Google Res, Mountain View, CA 94043 USA
Keywords
Extractive summarization; Abstractive summarization; Neural models; Transformers; Summarization dataset;
DOI
10.1007/978-3-030-86331-9_6
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Document summarization compresses source document(s) into succinct, information-preserving text. A variant of this task is query-based multi-document summarization (qMDS), which targets summaries at specific informational needs, contextualized to the query. However, progress in this area is hindered by the limited availability of large-scale datasets. In this work, we make two contributions. First, we propose an approach for automatically generating a dataset for both extractive and abstractive summaries and release a version publicly. Second, we design a neural model, SIBERT, for extractive summarization that exploits the hierarchical nature of the input. It also infuses queries to extract query-specific summaries. We evaluate this model on the CoMSum dataset, showing significant improvement in performance. This should provide a baseline and enable using CoMSum for future research on qMDS.
Pages: 84-98
Page count: 15
Related papers
50 in total
  • [21] Mixture of Topic Model for Multi-document Summarization
    Liu Na
    Li Ming-xia
    Lu Ying
    Tang Xiao-jun
    Wang Hai-wen
    Xiao Peng
    26TH CHINESE CONTROL AND DECISION CONFERENCE (2014 CCDC), 2014, : 5168 - 5172
  • [22] Tiered sentence based topic model for multi-document summarization
    Akhtar, Nadeem
    Beg, M. M. Sufyan
    Javed, Hira
    Hussain, Md Muzakkir
    JOURNAL OF INFORMATION & OPTIMIZATION SCIENCES, 2022, 43 (08): : 2131 - 2141
  • [23] A cluster-sensitive graph model for query-oriented multi-document summarization
    Wei, Furu
    Li, Wenjie
    Lu, Qin
    He, Yanxiang
    ADVANCES IN INFORMATION RETRIEVAL, 2008, 4956 : 446 - +
  • [24] Query-oriented unsupervised multi-document summarization via deep learning model
    Zhong, Sheng-hua
    Liu, Yan
    Li, Bin
    Long, Jing
    EXPERT SYSTEMS WITH APPLICATIONS, 2015, 42 (21) : 8146 - 8155
  • [25] An Improved LDA Multi-Document Summarization Model Based on TensorFlow
    Zhong, Ying
    Tang, Zhuo
    Ding, Xiaofei
    Zhu, Li
    Le, Yuquan
    Li, Kenli
    Li, Keqin
    2017 IEEE 29TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2017), 2017, : 255 - 259
  • [26] A Hybrid Topic Model for Multi-Document Summarization
    Xu, JinAn
    Liu, JiangMing
    Araki, Kenji
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2015, E98D (05): : 1089 - 1094
  • [27] A Hybrid Hierarchical Model for Multi-Document Summarization
    Celikyilmaz, Asli
    Hakkani-Tur, Dilek
    ACL 2010: 48TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2010, : 815 - 824
  • [28] Research On Multi-document Summarization Based On LDA Topic Model
    Bian, Jinqiang
    Jiang, Zengru
    Chen, Qian
    2014 SIXTH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS (IHMSC), VOL 2, 2014, : 113 - 116
  • [29] A Multi-Document Coverage Reward for RELAXed Multi-Document Summarization
    Parnell, Jacob
    Unanue, Inigo Jauregi
    Piccardi, Massimo
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 5112 - 5128
  • [30] Using query expansion in graph-based approach for query-focused multi-document summarization
    Zhao, Lin
    Wu, Lide
    Huang, Xuanjing
    INFORMATION PROCESSING & MANAGEMENT, 2009, 45 (01) : 35 - 41