Weighted archetypal analysis of the multi-element graph for query-focused multi-document summarization

Cited by: 54
Authors
Canhasi, Ercan [1]
Kononenko, Igor [1]
Affiliation
[1] Univ Ljubljana, Fac Comp & Informat Sci, Ljubljana 1000, Slovenia
Keywords
Query-focused document summarization; Weighted archetypal analysis; Multi-element graph; Matrix factorization; LexRank
DOI
10.1016/j.eswa.2013.07.079
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Most existing research on applying matrix factorization approaches to query-focused multi-document summarization (Q-MDS) explores either soft/hard clustering or low-rank approximation methods. We apply a different kind of matrix factorization method, namely weighted archetypal analysis (wAA), to Q-MDS. In query-focused summarization, given a graph representation of a set of sentences weighted by similarity to the given query, positively and/or negatively salient sentences are values on the boundary of the weighted data set. We use wAA to compute these extreme values, the archetypes, and hence to estimate the importance of sentences in the target document set. We investigate the impact of using the multi-element graph model for query-focused summarization via wAA. We conducted experiments on the Document Understanding Conference (DUC) 2005 and 2006 data. Experimental results show that the proposed approach improves on other closely related methods and on many state-of-the-art systems. (C) 2013 Elsevier Ltd. All rights reserved.
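The factorization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simplified form of weighted archetypal analysis in which each row of a sentence-similarity matrix is scaled by its query weight and the factors are fitted by alternating projected gradient steps. The function names (`project_rows_to_simplex`, `weighted_archetypal_analysis`), the toy matrix `S`, and the toy weights `q` are all hypothetical.

```python
import numpy as np

def project_rows_to_simplex(M):
    """Euclidean projection of each row of M onto the probability simplex."""
    n, d = M.shape
    U = -np.sort(-M, axis=1)                   # each row sorted in descending order
    css = np.cumsum(U, axis=1) - 1.0
    idx = np.arange(1, d + 1)
    rho = (U - css / idx > 0).sum(axis=1)      # size of the support of each projected row
    theta = css[np.arange(n), rho - 1] / rho   # per-row shift
    return np.maximum(M - theta[:, None], 0.0)

def weighted_archetypal_analysis(X, w, k, n_iter=500, seed=0):
    """Approximate diag(w) X by A (B diag(w) X), with rows of A and B on the simplex.

    Assumption: weights are applied by scaling rows of X, a simplification
    of weighted archetypal analysis chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Xw = w[:, None] * X                        # weighted data
    A = project_rows_to_simplex(rng.random((n, k)))
    B = project_rows_to_simplex(rng.random((k, n)))
    for _ in range(n_iter):
        Z = B @ Xw                             # current archetypes (k x m)
        # Projected gradient step on A with step size 1 / Lipschitz constant.
        grad_A = (A @ Z - Xw) @ Z.T
        A = project_rows_to_simplex(A - grad_A / (np.linalg.norm(Z @ Z.T, 2) + 1e-12))
        # Projected gradient step on B.
        grad_B = A.T @ (A @ B @ Xw - Xw) @ Xw.T
        lip_B = np.linalg.norm(A.T @ A, 2) * np.linalg.norm(Xw @ Xw.T, 2) + 1e-12
        B = project_rows_to_simplex(B - grad_B / lip_B)
    return A, B, B @ Xw

# Toy input: 8 "sentences" with a symmetric similarity matrix and query weights.
rng = np.random.default_rng(1)
S = rng.random((8, 8))
S = (S + S.T) / 2
q = rng.random(8)
A, B, Z = weighted_archetypal_analysis(S, q, k=3)
```

Each row of `A` expresses one sentence as a convex mixture of the three archetypes, and each row of `B` expresses one archetype as a convex mixture of the weighted sentences. The abstract does not say how these factors are turned into sentence scores, so the sketch stops at the factorization.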
Pages: 535-543
Number of pages: 9