Aspect Based Multi-Document Summarization

Cited by: 0
Authors
Sahoo, Deepak [1]
Balabantaray, Rakesh [1]
Phukon, Mridumoni [2]
Saikia, Saibali [2]
Affiliations
[1] IIIT Bhubaneswar, Dept Comp Sci & Engn, Bhubaneswar, Odisha, India
[2] Gauhati Univ, GUIST, Dept IT, Gauhati, Assam, India
Keywords
Summarization; Clustering; Term Weight; Positional Weight; Chronological Weight; Aspect Weight
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Multi-document summarization is useful when a user deals with a group of heterogeneous documents and wants to compile the important information present in the collection, or when a group of homogeneous documents has been retrieved from a large corpus as the result of a query. We present an approach to automatic multi-document summarization based on clustering and sentence extraction. The user provides a query, and the documents relevant to that query are extracted from a corpus containing documents from various domains. An n x n similarity matrix is then built from the sentence-level similarity between the sentences of all extracted documents. Clusters of similar sentences are formed using the Markov clustering algorithm. Within each cluster, each sentence is assigned five different weights: (1) chronological weight (document level); (2) position weight (position of the sentence in the document); (3) sentence weight (based on term weight); (4) aspect-based weight (for sentences containing aspect words); and (5) synonymy and hyponymy weight. The top-ranked sentences, those with the highest weight, are then extracted from each cluster and presented to the user.
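The clustering and extraction steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which sentence similarity measure is used, how the Markov clustering parameters are set, or how the five weights are computed and combined. The sketch assumes bag-of-words cosine similarity, an inflation parameter of 2.0, a fixed number of iterations, and a plain sum of whatever weight components are supplied; all function names are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_matrix(sentences):
    """Build the n x n matrix of pairwise sentence similarities."""
    bags = [Counter(s.lower().split()) for s in sentences]
    return [[cosine(x, y) for y in bags] for x in bags]


def normalize_columns(m):
    """Rescale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def markov_cluster(sim, inflation=2.0, iterations=30):
    """Markov clustering: alternate expansion and inflation, then read clusters.

    inflation=2.0 and iterations=30 are illustrative defaults, not values
    taken from the paper.
    """
    n = len(sim)
    m = normalize_columns(sim)
    for _ in range(iterations):
        # Expansion: square the matrix (two-step random walk).
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: raise entries to a power and renormalize the columns.
        m = normalize_columns([[v ** inflation for v in row] for row in m])
    # Each row that still carries mass is an attractor; the columns it
    # reaches form one cluster. Identical member sets are kept only once.
    clusters = []
    for row in m:
        members = frozenset(j for j, v in enumerate(row) if v > 1e-6)
        if members and members not in clusters:
            clusters.append(members)
    return [sorted(c) for c in clusters]


def pick_representatives(clusters, weights):
    """Pick, per cluster, the sentence index with the highest total weight.

    weights[i] holds the weight components of sentence i; summing them is an
    assumption, since the abstract does not state how they are combined.
    """
    return [max(c, key=lambda i: sum(weights[i])) for c in clusters]
```

Given a list of sentences, `markov_cluster(similarity_matrix(sentences))` returns lists of sentence indices, and `pick_representatives` then selects one index per cluster from caller-supplied weight tuples (for example chronological, position, term, aspect, and synonymy/hyponymy weights).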
Pages: 873-877
Page count: 5