On Extractive and Abstractive Neural Document Summarization with Transformer Language Models

Cited by: 0
Authors
Pilault, Jonathan [1 ,2 ,3 ]
Li, Raymond [1 ]
Subramanian, Sandeep [1 ,2 ,4 ]
Pal, Christopher [1 ,2 ,3 ,4 ,5 ]
Affiliations
[1] Element AI, Montreal, PQ, Canada
[2] Mila, Montreal, PQ, Canada
[3] Polytech Montreal, Montreal, PQ, Canada
[4] Univ Montreal, Montreal, PQ, Canada
[5] Canada CIFAR AI Chair, Montreal, PQ, Canada
Keywords
DOI
Not available
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
We present a method for producing abstractive summaries of long documents that exceed several thousand words via neural abstractive summarization. We first perform a simple extractive step, whose output is then used to condition the transformer language model on relevant information before it is tasked with generating the summary. We show that this approach produces more abstractive summaries than prior work that employs a copy mechanism, while still achieving higher ROUGE scores. We provide extensive comparisons with strong baseline methods, prior state-of-the-art work, and multiple variants of our approach, including those using only transformers, only extractive techniques, and combinations of the two. We examine these models on four summarization tasks and datasets: arXiv papers, PubMed papers, and the Newsroom and BigPatent datasets. We find that transformer-based methods produce summaries with fewer n-gram copies, yielding n-gram copying statistics that are more similar to those of human-generated abstracts. We include a human evaluation, finding that transformers are ranked highly for coherence and fluency, but purely extractive methods score higher for informativeness and relevance. We hope that these architectures and experiments may serve as strong points of comparison for future work.
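The abstract describes a two-stage pipeline: an extractive step selects salient content from the long document, and that content is used to condition a transformer language model that then generates the abstractive summary. The sketch below illustrates this control flow only; the frequency-based sentence scorer, the prompt format, and the abstractive_model callable are illustrative assumptions, not the authors' released implementation.

    # Minimal sketch of an extract-then-abstract pipeline (assumptions noted above).
    import re
    from collections import Counter
    from typing import Callable, List

    def split_sentences(text: str) -> List[str]:
        # Naive sentence splitter; a real system would use a proper tokenizer.
        return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    def extract_salient_sentences(document: str, budget: int = 10) -> List[str]:
        # Score each sentence by word-frequency overlap with the whole document
        # and keep the top `budget` sentences in their original order. This is a
        # stand-in for the paper's extractive step, not the authors' extractor.
        sentences = split_sentences(document)
        doc_counts = Counter(w.lower() for w in re.findall(r"\w+", document))

        def score(sent: str) -> float:
            words = re.findall(r"\w+", sent.lower())
            return sum(doc_counts[w] for w in words) / (len(words) or 1)

        ranked = sorted(range(len(sentences)),
                        key=lambda i: score(sentences[i]), reverse=True)
        return [sentences[i] for i in sorted(ranked[:budget])]

    def summarize(document: str,
                  abstractive_model: Callable[[str], str],
                  budget: int = 10) -> str:
        # Condition a transformer language model on the extracted sentences,
        # then let it generate the abstractive summary. The "TL;DR" prompt
        # format is an assumption made for this sketch.
        extracted = " ".join(extract_salient_sentences(document, budget))
        prompt = extracted + "\nTL;DR:\n"
        return abstractive_model(prompt)

In use, abstractive_model would wrap a trained transformer language model's decoding routine; the sketch hides that detail behind a simple callable so only the extract-then-condition structure described in the abstract is shown.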
Pages: 9308-9319
Page count: 12
Related Papers (50 in total)
  • [1] Unified extractive-abstractive summarization: a hybrid approach utilizing BERT and transformer models for enhanced document summarization
    Divya, S.
    Sripriya, N.
    Andrew, J.
    Mazzara, Manuel
    PEERJ COMPUTER SCIENCE, 2024, 10 : 1 - 26
  • [2] Abstractive Summarization with the Aid of Extractive Summarization
    Chen, Yangbin
    Ma, Yun
    Mao, Xudong
    Li, Qing
    WEB AND BIG DATA (APWEB-WAIM 2018), PT I, 2018, 10987 : 3 - 15
  • [3] Integrating Extractive and Abstractive Models for Long Text Summarization
    Wang, Shuai
    Zhao, Xiang
    Li, Bo
    Ge, Bin
    Tang, Daquan
    2017 IEEE 6TH INTERNATIONAL CONGRESS ON BIG DATA (BIGDATA CONGRESS 2017), 2017, : 305 - 312
  • [4] Neural Latent Extractive Document Summarization
    Zhang, Xingxing
    Lapata, Mirella
    Wei, Furu
    Zhou, Ming
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 779 - 784
  • [5] Topic Attentional Neural Network for Abstractive Document Summarization
    Liu, Hao
    Zheng, Hai-Tao
    Wang, Wei
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2019, PT II, 2019, 11440 : 70 - 81
  • [6] Improving Neural Abstractive Document Summarization with Structural Regularization
    Li, Wei
    Xiao, Xinyan
    Lyu, Yajuan
    Wang, Yuanzhuo
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 4078 - 4087
  • [7] A Combined Extractive With Abstractive Model for Summarization
    Liu, Wenfeng
    Gao, Yaling
    Li, Jinming
    Yang, Yuzhen
    IEEE ACCESS, 2021, 9 : 43970 - 43980
  • [8] Neural attention model with keyword memory for abstractive document summarization
    Choi, YunSeok
    Kim, Dahae
    Lee, Jee-Hyong
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2020, 32 (18):
  • [9] Abstractive Summarization by Neural Attention Model with Document Content Memory
    Choi, Yunseok
    Kim, Dahae
    Lee, Jee-Hyong
    PROCEEDINGS OF THE 2018 CONFERENCE ON RESEARCH IN ADAPTIVE AND CONVERGENT SYSTEMS (RACS 2018), 2018, : 11 - 16
  • [10] Abstractive Document Summarization via Neural Model with Joint Attention
    Hou, Liwei
    Hu, Po
    Bei, Chao
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2017, 2018, 10619 : 329 - 338