Unified extractive-abstractive summarization: a hybrid approach utilizing BERT and transformer models for enhanced document summarization

Cited by: 0
Authors
Divya, S. [1]
Sripriya, N. [1]
Andrew, J. [2]
Mazzara, Manuel [3]
Affiliations
[1] SSN Coll Engn, Dept Informat Technol, Kalavakkam, Tamil Nadu, India
[2] Manipal Acad Higher Educ, Dept Comp Sci & Engn, Manipal Inst Technol, Manipal, Karnataka, India
[3] Innopolis Univ, Inst Software Dev & Engn, Innopolis, Russia
Keywords
Document summarization; BERT; CNN; Transformer models; Abstractive summarization
DOI
10.7717/peerj-cs.2424
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the exponential proliferation of digital documents, there arises a pressing need for automated document summarization (ADS). Summarization, a compression technique, condenses a source document into concise sentences that encapsulate its salient information for summary generation. A primary challenge lies in crafting a dependable summary, contingent upon both extracted features and human-established parameters. This article introduces an intelligent methodology that seamlessly integrates extractive and abstractive techniques to ensure heightened relevance between the input document and its summary. Initially, input sentences undergo transformation into representations utilizing BERT, subsequently transposed into a symmetric matrix based on their similarity. Semantically congruent sentences are then extracted from this matrix to construct an extractive summary. The transformer model integrates an objective function highly symmetric and invariant under unitary transformation for language generation. This model refines the extracted informative sentences and generates an abstractive summary akin to manually crafted summaries. Employing this hybrid summarization technique on the CNN/DailyMail dataset and DUC2004, we evaluate its efficacy using ROUGE metrics. Results demonstrate the superiority of our proposed technique over conventional summarization methods.
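The extractive stage described in the abstract (sentence representations, a symmetric similarity matrix, selection of mutually similar sentences) can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several assumptions: a bag-of-words count vector stands in for the BERT sentence representation so the example runs without model downloads, cosine similarity is used to build the matrix, and sentences are ranked by their summed similarity to all other sentences. The function names `embed`, `similarity_matrix`, and `extract_summary` are hypothetical.

```python
import numpy as np


def embed(sentences):
    """Stand-in for BERT sentence embeddings (assumption): word-count vectors."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = np.zeros((len(sentences), len(vocab)))
    for row, sentence in enumerate(sentences):
        for word in sentence.lower().split():
            vectors[row, index[word]] += 1.0
    return vectors


def similarity_matrix(vectors):
    """Pairwise cosine similarity; the result is symmetric by construction."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.clip(norms, 1e-12, None)  # guard against empty sentences
    return unit @ unit.T


def extract_summary(sentences, k=2):
    """Keep the k sentences most similar to the rest, in document order."""
    sim = similarity_matrix(embed(sentences))
    np.fill_diagonal(sim, 0.0)           # ignore self-similarity
    scores = sim.sum(axis=1)             # assumed ranking rule: summed similarity
    top = sorted(np.argsort(-scores)[:k])
    return [sentences[i] for i in top]


if __name__ == "__main__":
    doc = [
        "the cat sat on the mat",
        "the cat lay on the mat",
        "stocks fell sharply today",
        "a cat sat on a mat",
    ]
    for sentence in extract_summary(doc, k=2):
        print(sentence)
```

In a fuller pipeline, `embed` would be replaced by a BERT encoder and the selected sentences would be passed to a transformer model for abstractive rewriting, as the abstract describes; the ranking rule here is only one plausible choice.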
Pages: 1 / 26
Page count: 26
Related papers
50 in total
  • [31] Genetic Semantic Graph Approach for Multi-document Abstractive Summarization
    Khan, Atif
    Salim, Naomie
    Kumar, Yogan Jaya
    2015 FIFTH INTERNATIONAL CONFERENCE ON DIGITAL INFORMATION PROCESSING AND COMMUNICATIONS (ICDIPC), 2015, : 173 - 181
  • [32] A CLUSTERED SEMANTIC GRAPH APPROACH FOR MULTI-DOCUMENT ABSTRACTIVE SUMMARIZATION
    Khan, Atif
    Salim, Naomie
    Reafee, Waleed
    Sukprasert, Anupong
    Kumar, Yogan Jaya
    JURNAL TEKNOLOGI, 2015, 77 (18): : 61 - 72
  • [33] An Extractive Malayalam Document Summarization Based on Graph Theoretic Approach
    Ajmal, E. B.
    Haroon, Rosna P.
    PROCEEDINGS 2015 FIFTH INTERNATIONAL CONFERENCE ON E-LEARNING (ECONF 2015), 2015, : 237 - 240
  • [34] A hybrid model for sentence ordering in extractive multi-document summarization
    Liu, Dexi
    Zhang, Zengchang
    He, Yanxiang
    Ji, Donghong
    INFORMATION RETRIEVAL TECHNOLOGY, PROCEEDINGS, 2006, 4182 : 588 - 592
  • [35] Extractive Chinese spoken document summarization using probabilistic ranking models
    Chen, Yi-Ting
    Yu, Suhan
    Wang, Hsin-Min
    Chen, Berlin
    CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2006, 4274 : 660 - +
  • [36] Clustered Genetic Semantic Graph Approach for Multi-document Abstractive Summarization
    Khan, Atif
    Salim, Naomie
    Farman, Haleem
    2016 INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS ENGINEERING (ICISE), 2016, : 63 - 70
  • [37] FuzzyTP-BERT: Enhancing extractive text summarization with fuzzy topic modeling and transformer networks
    Onan, Aytug
    Alhumyani, Hesham A.
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2024, 36 (06)
  • [38] A topic modeled unsupervised approach to single document extractive text summarization
    Srivastava, Ridam
    Singh, Prabhav
    Rana, K. P. S.
    Kumar, Vineet
    KNOWLEDGE-BASED SYSTEMS, 2022, 246
  • [39] Building an Extractive Arabic Text Summarization Using a Hybrid Approach
    Lakhdar, Said Moulay
    Cheragui, Mohamed Amine
    ARABIC LANGUAGE PROCESSING: FROM THEORY TO PRACTICE, ICALP 2019, 2019, 1108 : 135 - 148
  • [40] A Hybrid Solution To Abstractive Multi-Document Summarization Using Supervised and Unsupervised Learning
    Bhagchandani, Gaurav
    Bodra, Deep
    Gangan, Abhishek
    Mulla, Nikahat
    PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICCS), 2019, : 566 - 570