Unified extractive-abstractive summarization: a hybrid approach utilizing BERT and transformer models for enhanced document summarization

Cited by: 0
Authors
Divya, S. [1]
Sripriya, N. [1]
Andrew, J. [2]
Mazzara, Manuel [3]
Affiliations
[1] SSN Coll Engn, Dept Informat Technol, Kalavakkam, Tamil Nadu, India
[2] Manipal Acad Higher Educ, Dept Comp Sci & Engn, Manipal Inst Technol, Manipal, Karnataka, India
[3] Innopolis Univ, Inst Software Dev & Engn, Innopolis, Russia
Keywords
Document summarization; BERT; CNN; Transformer models; Abstractive summarization
DOI
10.7717/peerj-cs.2424
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the exponential proliferation of digital documents, there arises a pressing need for automated document summarization (ADS). Summarization, a compression technique, condenses a source document into concise sentences that encapsulate its salient information for summary generation. A primary challenge lies in crafting a dependable summary, contingent upon both extracted features and human-established parameters. This article introduces an intelligent methodology that seamlessly integrates extractive and abstractive techniques to ensure heightened relevance between the input document and its summary. Initially, input sentences undergo transformation into representations utilizing BERT, subsequently transposed into a symmetric matrix based on their similarity. Semantically congruent sentences are then extracted from this matrix to construct an extractive summary. The transformer model integrates an objective function highly symmetric and invariant under unitary transformation for language generation. This model refines the extracted informative sentences and generates an abstractive summary akin to manually crafted summaries. Employing this hybrid summarization technique on the CNN/DailyMail dataset and DUC2004, we evaluate its efficacy using ROUGE metrics. Results demonstrate the superiority of our proposed technique over conventional summarization methods.
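The extractive stage described in the abstract (sentence representations, a symmetric similarity matrix, then selection of mutually similar sentences) can be illustrated with a minimal sketch. This is not the authors' implementation. As loudly stated assumptions: `embed` is a hypothetical bag-of-words stand-in for a BERT sentence encoder, cosine similarity is assumed as the similarity measure, and ranking by summed similarity (degree centrality) is an assumed selection rule. The abstractive transformer stage is not shown.

```python
import math
from collections import Counter


def embed(sentence):
    """Hypothetical stand-in for a BERT encoder: a bag-of-words count vector."""
    return Counter(sentence.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(sentences):
    """Build the symmetric sentence-by-sentence similarity matrix."""
    vecs = [embed(s) for s in sentences]
    n = len(vecs)
    return [[cosine(vecs[i], vecs[j]) for j in range(n)] for i in range(n)]


def extractive_summary(sentences, k=2):
    """Pick the k sentences most similar to the rest, kept in document order."""
    sim = similarity_matrix(sentences)
    # Assumed scoring rule: total similarity to all other sentences.
    scores = [sum(row) - row[i] for i, row in enumerate(sim)]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    doc = [
        "the cat sat on the mat",
        "the cat lay on the mat",
        "stocks fell sharply today",
        "the dog sat on the mat",
    ]
    for line in extractive_summary(doc, k=2):
        print(line)
```

In a fuller pipeline, the selected sentences would be passed to a sequence-to-sequence transformer for rewriting, and the output would be scored against reference summaries with ROUGE.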
Pages: 1 / 26
Page count: 26
Related papers
50 in total
  • [21] EXABSUM: a new text summarization approach for generating extractive and abstractive summaries
    Zakariae Alami Merrouni
    Bouchra Frikh
    Brahim Ouhbi
    Journal of Big Data, 10
  • [22] A Fuzzy-Rough Hybrid Approach to Multi-document Extractive Summarization
    Huang, Hsun-Hui
    Yang, Horng-Chang
    Kuo, Yau-Hwang
    HIS 2009: 2009 NINTH INTERNATIONAL CONFERENCE ON HYBRID INTELLIGENT SYSTEMS, VOL 1, PROCEEDINGS, 2009, : 168 - +
  • [23] Abstractive Summarization: A Hybrid Approach for the Compression of Semantic Graphs
    Balaji, J.
    Geetha, T. V.
    Parthasarathi, Ranjani
    INTERNATIONAL JOURNAL ON SEMANTIC WEB AND INFORMATION SYSTEMS, 2016, 12 (02) : 76 - 99
  • [24] Extractive Document Summarization Using a Supervised Learning Approach
    Charitha, Sangaraju
    Chittaragi, Nagaratna B.
    Koolagudi, Shashidhar G.
    PROCEEDINGS OF 2018 IEEE DISTRIBUTED COMPUTING, VLSI, ELECTRICAL CIRCUITS AND ROBOTICS (DISCOVER), 2018, : 7 - 12
  • [25] Globalizing BERT-based Transformer Architectures for Long Document Summarization
    Grail, Quentin
    Perez, Julien
    Gaussier, Eric
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 1792 - 1810
  • [26] Turkish abstractive text document summarization using text to text transfer transformer
    Ay, Betul
    Ertam, Fatih
    Fidan, Guven
    Aydin, Galip
    ALEXANDRIA ENGINEERING JOURNAL, 2023, 68 : 1 - 13
  • [27] A Hybrid Approach For Automatic Document Summarization
    Rani, Siji S.
    Sreejith, K.
    Sanker, Arjun
    2017 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2017, : 663 - 669
  • [28] Enhanced automatic abstractive document summarization using transformers and sentence grouping
    Toprak, Ahmet
    Turan, Metin
    JOURNAL OF SUPERCOMPUTING, 2025, 81 (04):
  • [29] Dilated convolution for enhanced extractive summarization: A GAN-based approach with BERT word embedding
    Wu, Huimin
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2024, 46 (02) : 4777 - 4790
  • [30] Word topical mixture models for extractive spoken document summarization
    Chen, Berlin
    Chen, Yi-Ting
    2007 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-5, 2007, : 52 - 55