Enhancing diversity and coverage of document summaries through subspace clustering and clustering-based optimization

Cited by: 7
Authors
Cai, Xiaoyan [1]
Li, Wenjie [2]
Zhang, Renxian [3]
Affiliations
[1] Northwest Agr & Forestry Univ, Coll Informat Engn, Yangling, Shaanxi, Peoples R China
[2] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China
[3] Tongji Univ, Dept Comp Sci & Technol, Shanghai 200092, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Document summarization; Information diversity; Information coverage; Subspace clustering;
DOI
10.1016/j.ins.2014.04.028
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Sentence clustering has been successfully applied in document summarization to discover the topics conveyed in a collection of documents. However, existing clustering-based summarization approaches seldom target both the diversity and the coverage of summaries, which are believed to be the two key factors determining summary quality. The focus of this work is to explore a systematic approach that allows diversity and coverage to be tackled within an integrated clustering-based summarization framework. Given that each topic can normally be described by a set of keywords, and that the choice of keywords is topic-dependent, we take advantage of the recently emerged subspace clustering technique to enable flexible keyword selection and improve the quality of sentence clustering. On this basis, we develop two clustering-based optimization strategies, namely local optimization and global optimization, to pursue these targets. Experimental results on the DUC datasets demonstrate the effectiveness and robustness of the proposed approach. © 2014 Elsevier Inc. All rights reserved.
Pages: 764-775
Page count: 12
Related papers
50 in total
  • [21] Clustering-Based Subset Selection in Evolutionary Multiobjective Optimization
    Chen, Weiyu
    Ishibuchi, Hisao
    Shang, Ke
    2021 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2021, : 468 - 475
  • [22] Development of a clustering-based model for enhancing acoustic leak detection
    El-Zahab, Samer
    Asaad, Ahmed
    Abdelkader, Eslam Mohammed
    Zayed, Tarek
    CANADIAN JOURNAL OF CIVIL ENGINEERING, 2019, 46 (04) : 278 - 286
  • [23] Ontological summaries through hierarchical clustering
    Andreasen, Troels
    Bulskov, Henrik
    Terney, Thomas Vestskov
    FOUNDATIONS OF INTELLIGENT SYSTEMS, PROCEEDINGS, 2008, 4994 : 497 - 507
  • [24] Sparse Subspace Representation for Spectral Document Clustering
    Saha, Budhaditya
    Dinh Phung
    Pham, Duc Son
    Venkatesh, Svetha
    12TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2012), 2012, : 1092 - 1097
  • [25] Leaf Clustering Based on Sparse Subspace Clustering
    Ding, Yun
    Yan, Qing
    Zhang, Jing-Jing
    Xun, Li-Na
    Zheng, Chun-Hou
    INTELLIGENT COMPUTING THEORIES AND APPLICATION, ICIC 2016, PT II, 2016, 9772 : 55 - 66
  • [26] Clustering-based diversity improvement in top-N recommendation
    Aytekin, Tevfik
    Karakaya, Mahmut Ozge
    JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2014, 42 (01) : 1 - 18
  • [27] Multi-document summarization using a clustering-based hybrid strategy
    Nie, Yu
    Ji, Donghong
    Yang, Lingpeng
    Niu, Zhengyu
    He, Tingting
    INFORMATION RETRIEVAL TECHNOLOGY, PROCEEDINGS, 2006, 4182 : 608 - 614
  • [28] A clustering-based approach for integrating document-category hierarchies
    Cheng, Tsang-Hsiang
    Wei, Chih-Ping
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS, 2008, 38 (02) : 410 - 424
  • [29] Clustering-Based Ensemble Pruning and Multistage Organization Using Diversity
    Zyblewski, Pawel
    Wozniak, Michal
    HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, HAIS 2019, 2019, 11734 : 287 - 298