On Extractive and Abstractive Neural Document Summarization with Transformer Language Models

Cited by: 0

Authors:

Pilault, Jonathan [1,2,3]

Li, Raymond [1]

Subramanian, Sandeep [1,2,4]

Pal, Christopher [1,2,3,4,5]

Affiliations:

[1] Element AI, Montreal, PQ, Canada

[2] Mila, Montreal, PQ, Canada

[3] Polytech Montreal, Montreal, PQ, Canada

[4] Univ Montreal, Montreal, PQ, Canada

[5] Canada CIFAR AI Chair, Montreal, PQ, Canada

Source:

PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP) | 2020

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present a method to produce abstractive summaries of long documents that exceed several thousand words via neural abstractive summarization. We perform a simple extractive step before generating a summary; its output is then used to condition the transformer language model on relevant information before the model is tasked with generating a summary. We also show that this approach produces more abstractive summaries compared to prior work that employs a copy mechanism while still achieving higher ROUGE scores. We provide extensive comparisons with strong baseline methods, prior state-of-the-art work, as well as multiple variants of our approach including those using only transformers, only extractive techniques, and combinations of the two. We examine these models using four different summarization tasks and datasets: arXiv papers, PubMed papers, the Newsroom and BigPatent datasets. We find that transformer-based methods produce summaries with fewer n-gram copies, leading to n-gram copying statistics that are more similar to human-generated abstracts. We include a human evaluation, finding that transformers are ranked highly for coherence and fluency, but purely extractive methods score higher for informativeness and relevance. We hope that these architectures and experiments may serve as strong points of comparison for future work.

Pages: 9308-9319

Page count: 12

