Integrated analysis of gene expression by association rules discovery

Cited by: 68
Authors
Carmona-Saez, P
Chagoyen, M
Rodriguez, A
Trelles, O
Carazo, JM
Pascual-Montano, A [1]
Affiliations
[1] Univ Complutense Madrid, Fac CC Fis, Comp Architecture & Syst Engn Dept, E-28040 Madrid, Spain
[2] CSIC, CNB, Natl Biotechnol Ctr, BioComp Unit, E-28049 Madrid, Spain
[3] Univ Malaga, Comp Architecture Dept, Malaga 29080, Spain
DOI
10.1186/1471-2105-7-54
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Background: Microarray technology is generating huge amounts of data about the expression levels of thousands of genes, or even whole genomes, across different experimental conditions. To extract biological knowledge from such datasets and to understand them fully, it is essential to include external biological information about genes and gene products in the analysis of expression data. However, most current approaches to analyzing microarray datasets focus mainly on the experimental data, and external biological information is incorporated only afterwards, as a posterior step. Results: In this study we present a method for the integrative analysis of microarray data based on the Association Rules Discovery data mining technique. The approach integrates gene annotations and expression data to discover intrinsic associations between the two data sources based on co-occurrence patterns. We applied the proposed methodology to the analysis of gene expression datasets in which genes were annotated with metabolic pathways, transcriptional regulators and Gene Ontology categories. Automatically extracted associations revealed significant relationships among these gene attributes and expression patterns, many of which are clearly supported by recently reported work. Conclusion: The integration of external biological information and gene expression data can provide insights into the biological processes associated with gene expression programs. In this paper we show that the proposed methodology is able to integrate multiple gene annotations and expression data in the same analytic framework and to extract meaningful associations among heterogeneous sources of data. An implementation of the method is included in the Engene software package.
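The co-occurrence idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation or the Engene code: the gene names, annotation labels, discretized expression items (such as `cond1:down`), thresholds, and helper functions below are all hypothetical, and the search is a brute-force enumeration rather than an optimized association rule miner. Each gene is treated as a transaction that mixes annotation items with expression items, and a rule "annotation -> expression pattern" is kept when its support and confidence reach the chosen thresholds.

```python
from itertools import combinations

# Hypothetical toy data: each gene is a "transaction" mixing annotation items
# and discretized expression items. All names here are made up for illustration.
genes = {
    "geneA": {"GO:ribosome biogenesis", "cond1:down", "cond2:down"},
    "geneB": {"GO:ribosome biogenesis", "cond1:down", "cond2:down"},
    "geneC": {"GO:ribosome biogenesis", "cond1:down"},
    "geneD": {"pathway:glycolysis", "cond1:up"},
    "geneE": {"pathway:glycolysis", "cond1:up", "cond2:down"},
}


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def is_expression(item):
    """Assumed naming convention: expression items start with 'cond'."""
    return item.startswith("cond")


def mine_rules(transactions, min_support=0.4, min_confidence=0.8):
    """Brute-force search for rules {annotation} -> {expression items}."""
    transactions = list(transactions)
    items = sorted(set().union(*transactions))
    annotations = [i for i in items if not is_expression(i)]
    expression = [i for i in items if is_expression(i)]
    rules = []
    for ann in annotations:
        for size in range(1, len(expression) + 1):
            for consequent in combinations(expression, size):
                supp = support({ann, *consequent}, transactions)
                if supp < min_support:
                    continue
                # Confidence: how often the pattern holds among annotated genes.
                conf = supp / support({ann}, transactions)
                if conf >= min_confidence:
                    rules.append((ann, consequent, supp, conf))
    return rules


rules = mine_rules(genes.values())
for ann, consequent, supp, conf in rules:
    print(f"{ann} -> {', '.join(consequent)} "
          f"(support={supp:.2f}, confidence={conf:.2f})")
```

On this toy input the sketch keeps two rules, one per annotation. A real analysis would need a discretization step for the expression values and an efficient itemset search, neither of which is specified in this record.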
Pages: 16