A parameter-less algorithm for tensor co-clustering

Authors
Elena Battaglia
Ruggero G. Pensa
Affiliation
[1] University of Turin, Department of Computer Science
Source
Machine Learning | 2023 / Vol. 112
Keywords
Clustering; Higher-order data; Unsupervised learning
Abstract
The majority of the data produced by human activities and modern cyber-physical systems involve complex relations among their features. Such relations can often be represented by means of tensors, which can be viewed as a generalization of matrices and, as such, can be analyzed by using higher-order extensions of existing machine learning methods, such as clustering and co-clustering. Tensor co-clustering, in particular, has proven useful in many applications, due to its ability to cope with n-modal data and sparsity. However, setting up a co-clustering algorithm properly requires specifying the desired number of clusters for each mode as input parameters. This choice is already difficult in relatively easy settings, like flat clustering on data matrices, and on tensors it can be even more frustrating. To address this issue, we propose a new tensor co-clustering algorithm that does not require the number of desired co-clusters as input, as it optimizes an objective function based on a measure of association across discrete random variables (Goodman and Kruskal's τ) that is not affected by their cardinality. We introduce different optimization schemes and show their theoretical and empirical convergence properties. Additionally, we show the effectiveness of our algorithm on both synthetic and real-world datasets, and compare it with state-of-the-art co-clustering methods based on tensor factorization and latent block models.
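As a rough illustration of the association measure named in the abstract, the sketch below computes the standard two-variable form of Goodman and Kruskal's τ from a contingency table of counts. It is not the paper's tensor objective or its optimization scheme; the function name `goodman_kruskal_tau` and the choice to predict the column variable from the row variable are assumptions made for this example.

```python
import numpy as np


def goodman_kruskal_tau(table):
    """Goodman and Kruskal's tau for predicting the column variable Y
    from the row variable X, given a 2-D contingency table of counts.

    tau = (sum_ij p_ij^2 / p_i. - sum_j p_.j^2) / (1 - sum_j p_.j^2)
    """
    p = np.asarray(table, dtype=float)
    p = p / p.sum()              # joint probabilities p_ij
    row = p.sum(axis=1)          # row marginals p_i.
    col = p.sum(axis=0)          # column marginals p_.j

    keep = row > 0               # skip empty rows to avoid division by zero
    explained = np.sum(p[keep] ** 2 / row[keep, None])
    baseline = np.sum(col ** 2)

    if np.isclose(baseline, 1.0):
        return 0.0               # Y takes a single value: nothing to predict
    return (explained - baseline) / (1.0 - baseline)


# Hypothetical example tables (illustrative only, not data from the paper).
independent = [[1, 1], [1, 1]]   # X carries no information about Y
diagonal = [[5, 0], [0, 5]]      # X determines Y exactly
tau_independent = goodman_kruskal_tau(independent)
tau_diagonal = goodman_kruskal_tau(diagonal)
```

In this two-variable form, τ is 0 when the row variable gives no help in predicting the column variable and 1 when it determines it exactly. A co-clustering method built on it would presumably evaluate τ on tables aggregated by the candidate cluster assignments, which this sketch does not attempt.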
Pages: 385-427
Page count: 42