Discovering transcriptional modules by Bayesian data integration

Citations: 50
Authors
Savage, Richard S. [1]
Ghahramani, Zoubin [2]
Griffin, Jim E. [3]
de la Cruz, Bernard J.
Wild, David L. [1]
Affiliations
[1] Univ Warwick, Syst Biol Ctr, Coventry CV4 7AL, W Midlands, England
[2] Univ Cambridge, Dept Engn, Cambridge CB2 1PZ, England
[3] Univ Kent, Sch Math Stat & Actuarial Sci, Canterbury, Kent, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
GENE-EXPRESSION DATA; MIXTURE MODEL; NONPARAMETRIC PROBLEMS; REGULATORY NETWORKS; DIRICHLET PROCESSES; CLUSTER-ANALYSIS; MICROARRAY DATA; CELL-CYCLE; GENOME; YEAST;
DOI
10.1093/bioinformatics/btq210
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Motivation: We present a method for directly inferring transcriptional modules (TMs) by integrating gene expression and transcription factor binding (ChIP-chip) data. Our model extends a hierarchical Dirichlet process mixture model to allow data fusion on a gene-by-gene basis. This encodes the intuition that co-expression and co-regulation are not necessarily equivalent, and hence we do not expect all genes to group similarly in both datasets. In particular, it allows us to identify the subset of genes that share the same structure of transcriptional modules in both datasets.
Results: We find that by working on a gene-by-gene basis, our model is able to extract clusters with greater functional coherence than existing methods. By combining gene expression and transcription factor binding (ChIP-chip) data in this way, we are better able to determine the groups of genes that are most likely to represent underlying TMs.
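The gene-by-gene fusion idea in the abstract can be illustrated with a toy generative sketch. This is not the authors' model or code: it only draws cluster labels from a Chinese restaurant process (the partition prior induced by a Dirichlet process) for each dataset, then marks some genes as "fused" so that they carry the same label in both datasets. The function names, the `p_fused` parameter, and the treatment of label indices as shared component indices are all assumptions made for illustration.

```python
import random


def crp_labels(n_items, alpha, rng):
    """Draw cluster labels for n_items from a Chinese restaurant process."""
    labels, counts = [], []  # counts[k] = items currently in cluster k
    for _ in range(n_items):
        # Existing cluster k has weight counts[k]; a new cluster has weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(0)  # open a new cluster
        counts[k] += 1
        labels.append(k)
    return labels


def toy_fusion(n_genes, alpha=1.0, p_fused=0.5, seed=0):
    """Toy illustration (hypothetical): per-gene fusion of two clusterings."""
    rng = random.Random(seed)
    expr = crp_labels(n_genes, alpha, rng)  # labels in the expression data
    chip = crp_labels(n_genes, alpha, rng)  # labels in the binding data
    fused = [rng.random() < p_fused for _ in range(n_genes)]
    # Assumption: a fused gene shares one label across both datasets;
    # an unfused gene keeps its independently drawn labels.
    chip = [e if f else c for e, c, f in zip(expr, chip, fused)]
    return expr, chip, fused


expr, chip, fused = toy_fusion(n_genes=20, seed=1)
```

In this sketch, the genes with `fused[i]` set to `True` play the role of the subset that groups the same way in both datasets; an actual inference procedure would estimate those indicators from the data rather than sample them.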
Pages: i158-i167
Number of pages: 10