iBBiG: iterative binary bi-clustering of gene sets

Cited by: 35
Authors
Gusenleitner, Daniel [1]
Howe, Eleanor A. [1, 2]
Bentink, Stefan [1, 3]
Quackenbush, John [1, 3, 4]
Culhane, Aedin C. [1, 3]
Affiliations
[1] Dana Farber Canc Inst, Dept Biostat & Computat Biol, Boston, MA 02115 USA
[2] Univ Oxford, Dept Stat, Oxford OX1 3TG, England
[3] Harvard Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
[4] Dana Farber Canc Inst, Dept Canc Biol, Boston, MA 02115 USA
Funding
National Institutes of Health (USA);
Keywords
ENRICHMENT ANALYSIS; BIOLOGICAL PROCESSES; MICROARRAY DATA; EXPRESSION DATA; DISEASES; CCL5;
DOI
10.1093/bioinformatics/bts438
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
Motivation: Meta-analysis of genomics data seeks to identify genes associated with a biological phenotype across multiple datasets; however, merging data from different platforms by their features (genes) is challenging. Meta-analysis using functionally or biologically characterized gene sets simplifies data integration, is biologically intuitive and is seen as having great potential, but it is an emerging field with few established statistical methods.
Results: We transform gene expression profiles into binary gene set profiles by discretizing the results of gene set enrichment analyses, and apply a new iterative bi-clustering algorithm (iBBiG) to identify groups of gene sets that are coordinately associated with groups of phenotypes across multiple studies. iBBiG is optimized for meta-analysis of large numbers of diverse genomics datasets that may have unmatched samples. It does not require prior knowledge of the number or size of clusters. When applied to simulated data, it outperforms commonly used clustering methods, discovers overlapping clusters of diverse sizes and is robust in the presence of noise. We apply it to a meta-analysis of breast cancer studies, where iBBiG extracted novel gene set-phenotype associations that predicted tumor metastases within tumor subtypes.
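The discretization step mentioned in the abstract can be illustrated with a minimal sketch. This is not the iBBiG implementation: the function name `binarize_enrichment`, the significance cutoff `alpha=0.05`, and the toy p-value matrix are all assumptions made for illustration, and the final lookup only shows what a shared block of ones looks like rather than performing iterative bi-clustering.

```python
import numpy as np

def binarize_enrichment(p_values, alpha=0.05):
    """Return a 0/1 matrix with 1 where a gene set is called enriched (p < alpha).

    Hypothetical helper: the cutoff alpha is an assumed value, not one
    taken from the paper.
    """
    p = np.asarray(p_values, dtype=float)
    return (p < alpha).astype(np.int8)

# Made-up p-values: rows are gene sets, columns are phenotype comparisons.
p_values = np.array([
    [0.001, 0.020, 0.700],
    [0.030, 0.004, 0.900],
    [0.500, 0.600, 0.010],
])

binary = binarize_enrichment(p_values)

# Gene sets (row indices) that are enriched in both of the first two columns.
# A bi-clustering method would search for such row/column blocks automatically.
shared = np.where(binary[:, [0, 1]].all(axis=1))[0]

print(binary)
print(shared)
```

Working on a binary matrix like this is what lets studies from different platforms be compared at the gene set level, since only the presence or absence of an association is kept.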
Pages: 2484-2492
Number of pages: 9