Unified extractive-abstractive summarization: a hybrid approach utilizing BERT and transformer models for enhanced document summarization

Cited by: 0
Authors
Divya, S. [1]
Sripriya, N. [1]
Andrew, J. [2]
Mazzara, Manuel [3]
Affiliations
[1] SSN Coll Engn, Dept Informat Technol, Kalavakkam, Tamil Nadu, India
[2] Manipal Acad Higher Educ, Dept Comp Sci & Engn, Manipal Inst Technol, Manipal, Karnataka, India
[3] Innopolis Univ, Inst Software Dev & Engn, Innopolis, Russia
Keywords
Document summarization; BERT; CNN; Transformer models; Abstractive summarization
DOI
10.7717/peerj-cs.2424
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
With the exponential growth of digital documents, there is a pressing need for automated document summarization (ADS). Summarization is a compression technique that condenses a source document into concise sentences capturing its salient information. A primary challenge lies in producing a dependable summary, which depends on both the extracted features and human-established parameters. This article introduces a methodology that integrates extractive and abstractive techniques to improve the relevance between the input document and its summary. First, input sentences are converted into representations using BERT and then arranged into a symmetric matrix based on their similarity. Semantically similar sentences are extracted from this matrix to construct an extractive summary. The transformer model integrates an objective function that is highly symmetric and invariant under unitary transformation for language generation. This model refines the extracted informative sentences and generates an abstractive summary akin to manually crafted summaries. We apply this hybrid summarization technique to the CNN/DailyMail and DUC2004 datasets and evaluate its efficacy using ROUGE metrics. The results demonstrate the superiority of the proposed technique over conventional summarization methods.
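The extractive stage described in the abstract (sentence representations arranged into a symmetric similarity matrix, from which sentences are selected) can be illustrated with a minimal sketch. Several things here are assumptions rather than the authors' method: the abstract does not state the similarity measure or the selection rule, so the sketch uses cosine similarity and ranks sentences by their total similarity to the others. The function names `cosine_similarity_matrix` and `select_sentences` are hypothetical, and the small hand-written vectors stand in for BERT sentence embeddings, which are not computed here.

```python
import numpy as np


def cosine_similarity_matrix(embeddings):
    """Return the symmetric matrix of pairwise cosine similarities."""
    emb = np.asarray(embeddings, dtype=float)
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    # Guard against division by zero for all-zero vectors.
    unit = emb / np.clip(norms, 1e-12, None)
    return unit @ unit.T


def select_sentences(embeddings, k):
    """Pick the k most central sentences and return their indices in document order.

    Assumption: "centrality" is the sum of a sentence's similarities to all
    other sentences; the source abstract does not specify the selection rule.
    """
    sim = cosine_similarity_matrix(embeddings)
    np.fill_diagonal(sim, 0.0)  # ignore self-similarity
    scores = sim.sum(axis=1)
    top = np.argsort(-scores, kind="stable")[:k]
    return sorted(int(i) for i in top)


# Toy stand-ins for sentence embeddings (assumption: real ones would come from BERT).
toy_embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],  # an outlier sentence, dissimilar to the rest
    [0.8, 0.2, 0.1],
])

similarity = cosine_similarity_matrix(toy_embeddings)
summary_indices = select_sentences(toy_embeddings, k=2)
print(summary_indices)
```

In a full pipeline the selected sentences would then be passed to the abstractive transformer model for rewriting; that stage is not sketched here because the abstract gives no detail on its architecture or objective beyond the properties it names.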
Pages: 1 / 26
Number of pages: 26