The em algorithm for kernel matrix completion with auxiliary data

Cited by: 21
Authors
Tsuda, K
Akaho, S
Asai, K
Affiliations
[1] Max Planck Inst Biol Cybernet, D-72076 Tubingen, Germany
[2] AIST Computat Biol Res Ctr, Tokyo 1350064, Japan
[3] AIST Neurosci Res Inst, Tsukuba, Ibaraki 3058568, Japan
[4] Univ Tokyo, Grad Sch Frontier Sci, Dept Computat Biol, Kashiwa, Chiba 2778562, Japan
Keywords
information geometry; em algorithm; kernel matrix completion; bacteria clustering
DOI
10.1162/153244304322765649
Chinese Library Classification (CLC) code
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In biological data, it is often the case that observed data are available only for a subset of samples. When a kernel matrix is derived from such data, we have to leave the entries for unavailable samples as missing. In this paper, the missing entries are completed by exploiting an auxiliary kernel matrix derived from another information source. The parametric model of kernel matrices is created as a set of spectral variants of the auxiliary kernel matrix, and the missing entries are estimated by fitting this model to the existing entries. For model fitting, we adopt the em algorithm (distinguished from the EM algorithm of Dempster et al., 1977) based on the information geometry of positive definite matrices. We will report promising results on bacteria clustering experiments using two marker sequences: 16S and gyrB.
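The alternating fit described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code. It assumes that the observed samples come first (so the observed kernel is the leading block), that the auxiliary kernel matrix is strictly positive definite, that "spectral variants" means matrices sharing the auxiliary matrix's eigenvectors with freely adjustable eigenvalues, and that both steps minimize the Kullback-Leibler divergence between zero-mean Gaussians. The function name `em_complete` and its parameters are hypothetical.

```python
import numpy as np

def em_complete(K_obs, M_aux, n_iter=50):
    """Sketch: complete a kernel matrix whose leading n x n block is K_obs.

    Assumptions (not taken from the source): observed samples are ordered
    first, M_aux is strictly positive definite, and the model is the set of
    matrices sharing M_aux's eigenvectors with free positive eigenvalues.
    """
    n = K_obs.shape[0]
    # Eigenvectors of the auxiliary kernel span the spectral model.
    _, V = np.linalg.eigh(M_aux)
    M = M_aux.copy()
    for _ in range(n_iter):
        # e-step: keep the observed block fixed and fill the missing blocks
        # with the Gaussian-conditional completion implied by the model M.
        M_vv, M_vh, M_hh = M[:n, :n], M[:n, n:], M[n:, n:]
        A = np.linalg.solve(M_vv, M_vh)            # M_vv^{-1} M_vh
        D_vh = K_obs @ A
        D_hh = M_hh - M_vh.T @ A + A.T @ K_obs @ A
        D = np.block([[K_obs, D_vh], [D_vh.T, D_hh]])
        # m-step: refit the eigenvalues, b_j = v_j^T D v_j.
        b = np.sum(V * (D @ V), axis=0)
        M = (V * b) @ V.T
    return D, M
```

Under these assumptions, `D` is the completed matrix (its leading block equals `K_obs` exactly) and `M` is the fitted spectral variant of the auxiliary matrix. If the observed block already agrees with the auxiliary matrix, the iteration leaves the auxiliary matrix unchanged.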
Pages: 67-81
Page count: 15