HICC: an entropy splitting-based framework for hierarchical co-clustering

Authors
Wei Cheng
Xiang Zhang
Feng Pan
Wei Wang
Affiliations
[1] University of North Carolina at Chapel Hill, Department of Computer Science
[2] Case Western Reserve University, Department of Electrical Engineering and Computer Science
[3] Microsoft, Department of Computer Science
[4] University of California
Source
Knowledge and Information Systems, 2016, 46(2)
Keywords
Co-clustering; Entropy; Contingency table; Text analysis
DOI
Not available
Abstract
Two-dimensional contingency tables, or co-occurrence matrices, arise frequently in important applications such as text analysis and web-log mining. As a fundamental research topic, co-clustering aims to generate a meaningful partition of the contingency table that reveals hidden relationships between rows and columns. Traditional co-clustering algorithms usually produce a predefined number of flat partitions of both rows and columns, which do not reveal the relationships among clusters. To address this limitation, hierarchical co-clustering algorithms have recently attracted considerable research interest. Although successful in various applications, existing hierarchical co-clustering algorithms are usually based on heuristics and lack a solid theoretical foundation. In this paper, we present a new co-clustering algorithm, HICC, that has a solid theoretical foundation. It simultaneously constructs a hierarchical structure of both row and column clusters that retains sufficient mutual information between the rows and columns of the contingency table. We develop an efficient and effective greedy algorithm that grows a co-cluster hierarchy by successively performing the row-wise or column-wise split that leads to the maximal mutual information gain. Extensive experiments on both synthetic and real datasets demonstrate that our algorithm can reveal essential relationships among row (and column) clusters and achieves better clustering precision than existing algorithms. Moreover, the experiments on real datasets show that HICC can effectively reveal hidden relationships between rows and columns in the contingency table.
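The greedy step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`mutual_information`, `aggregate`, `best_split`, `greedy_step`) are hypothetical, and candidate splits are enumerated exhaustively, which is only feasible for tiny tables; the abstract does not say how HICC generates candidate splits. The example starts with the columns already in two clusters, because when both rows and columns sit in a single cluster any split along one axis yields zero mutual information.

```python
import itertools

import numpy as np


def mutual_information(table):
    """I(X;Y) in bits for the joint distribution given by a contingency table."""
    p = np.asarray(table, dtype=float)
    p = p / p.sum()
    px = p.sum(axis=1, keepdims=True)   # row marginals, shape (n, 1)
    py = p.sum(axis=0, keepdims=True)   # column marginals, shape (1, m)
    nz = p > 0                          # skip empty cells (0 * log 0 = 0)
    return float((p[nz] * np.log2(p[nz] / (px @ py)[nz])).sum())


def aggregate(table, row_labels, col_labels):
    """Sum cells into one cell per (row cluster, column cluster) pair."""
    table = np.asarray(table, dtype=float)
    row_labels = np.asarray(row_labels)
    col_labels = np.asarray(col_labels)
    out = np.zeros((row_labels.max() + 1, col_labels.max() + 1))
    np.add.at(out, (row_labels[:, None], col_labels[None, :]), table)
    return out


def best_split(table, row_labels, col_labels):
    """Exhaustively try every two-way split of every row cluster.

    Returns (mutual information after the split, new row labels),
    or None when no cluster has at least two members.
    """
    row_labels = np.asarray(row_labels)
    new_id = row_labels.max() + 1
    best = None
    for c in np.unique(row_labels):
        members = np.flatnonzero(row_labels == c)
        rest = members[1:]  # keep members[0] fixed so each bipartition is tried once
        for k in range(1, len(rest) + 1):
            for moved in itertools.combinations(rest, k):
                cand = row_labels.copy()
                cand[list(moved)] = new_id
                mi = mutual_information(aggregate(table, cand, col_labels))
                if best is None or mi > best[0]:
                    best = (mi, cand)
    return best


def greedy_step(table, row_labels, col_labels):
    """Choose the row-wise or column-wise split with the larger mutual information."""
    table = np.asarray(table, dtype=float)
    row_try = best_split(table, row_labels, col_labels)
    col_try = best_split(table.T, col_labels, row_labels)  # columns act as rows of the transpose
    options = []
    if row_try is not None:
        options.append((row_try[0], "row", row_try[1]))
    if col_try is not None:
        options.append((col_try[0], "col", col_try[1]))
    if not options:
        return None
    return max(options, key=lambda o: o[0])


# Toy table with two clear blocks; rows start in one cluster, columns in two.
table = [[5, 5, 0, 0],
         [5, 5, 0, 0],
         [0, 0, 5, 5],
         [0, 0, 5, 5]]
mi, axis, labels = greedy_step(table, [0, 0, 0, 0], [0, 0, 1, 1])
print(axis, labels.tolist(), mi)
```

Calling `greedy_step` repeatedly and recording each chosen split would grow a hierarchy one split at a time; a practical version would need a cheaper way to propose splits and a stopping rule, neither of which is specified in the abstract.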
Pages: 343-367
Page count: 24
Related papers
  • [1] HICC: an entropy splitting-based framework for hierarchical co-clustering
    Cheng, Wei
    Zhang, Xiang
    Pan, Feng
    Wang, Wei
    KNOWLEDGE AND INFORMATION SYSTEMS, 2016, 46 (02) : 343 - 367