Data Augmentation for Abstractive Query-Focused Multi-Document Summarization

Cited by: 0
Authors
Pasunuru, Ramakanth [1]
Celikyilmaz, Asli [2]
Galley, Michel [2]
Xiong, Chenyan [2]
Zhang, Yizhe [2]
Bansal, Mohit [1]
Gao, Jianfeng [2]
Affiliations
[1] University of North Carolina, Chapel Hill, NC 27599, USA
[2] Microsoft Research, Redmond, WA, USA
Funding
National Science Foundation (USA)
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The progress in Query-focused Multi-Document Summarization (QMDS) has been limited by the lack of sufficient large-scale high-quality training datasets. We present two QMDS training datasets, which we construct using two data augmentation methods: (1) transferring the commonly used single-document CNN/Daily Mail summarization dataset to create the QMDSCNN dataset, and (2) mining search-query logs to create the QMDSIR dataset. These two datasets have complementary properties, i.e., QMDSCNN has real summaries but queries are simulated, while QMDSIR has real queries but simulated summaries. To cover both these real summary and query aspects, we build abstractive end-to-end neural network models on the combined datasets that yield new state-of-the-art transfer results on DUC datasets. We also introduce new hierarchical encoders that enable a more efficient encoding of the query together with multiple documents. Empirical results demonstrate that our data augmentation and encoding methods outperform baseline models on automatic metrics, as well as on human evaluations along multiple attributes.
Pages: 13666-13674
Number of pages: 9
Related papers
50 in total
  • [21] A Lexical Chain approach for update-style query-focused multi-document summarization
    Li, Jing
    Sun, Le
    INFORMATION RETRIEVAL TECHNOLOGY, 2008, 4993 : 310 - 320
  • [22] A Context-Sensitive Manifold Ranking Approach to Query-Focused Multi-document Summarization
    Cai, Xiaoyan
    Li, Wenjie
    PRICAI 2010: TRENDS IN ARTIFICIAL INTELLIGENCE, 2010, 6230 : 27 - 38
  • [23] Abstractive Multi-Document Summarization
    Ranjitha, N. S.
    Kallimani, Jagadish S.
    2017 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2017, : 1690 - 1693
  • [24] A New Feature-Fusion Sentence Selecting Strategy for Query-Focused Multi-Document Summarization
    He, Tingting
    Li, Fang
    Shao, Wei
    Chen, Jinguang
    Ma, Liang
    ALPIT 2008: SEVENTH INTERNATIONAL CONFERENCE ON ADVANCED LANGUAGE PROCESSING AND WEB INFORMATION TECHNOLOGY, PROCEEDINGS, 2008, : 81 - 86
  • [25] Graph-Based Query-Focused Multi-document Summarization Using Improved Affinity Graph
    Hu, Po
    He, Jiacong
    Zhang, Yong
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2015, 2015, 9403 : 336 - 347
  • [26] A Graph Based Query Focused Multi-Document Summarization
    Balaji, J.
    Geetha, T.
    Parthasarathi, Ranjani
    INTERNATIONAL JOURNAL OF INTELLIGENT INFORMATION TECHNOLOGIES, 2014, 10 (01) : 16 - 41
  • [27] Coarse-to-Fine Query Focused Multi-Document Summarization
    Xu, Yumo
    Lapata, Mirella
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 3632 - 3645
  • [28] Using Contextual Topic Model for a Query-Focused Multi-Document Summarizer
    Yang, Guangbing
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2016, 25 (01)
  • [29] Using Proximity in Query Focused Multi-document Extractive Summarization
    Li, Sujian
    Zhang, Yu
    Wang, Wei
    Wang, Chen
    COMPUTER PROCESSING OF ORIENTAL LANGUAGES: LANGUAGE TECHNOLOGY FOR THE KNOWLEDGE-BASED ECONOMY, 2009, 5459 : 179 - 188
  • [30] Disentangling Specificity for Abstractive Multi-document Summarization
    Ma, Congbo (congbo.ma@mq.edu.au), 1600, Institute of Electrical and Electronics Engineers Inc.