Knowledge-based analysis of microarray gene expression data by using support vector machines

Cited by: 1495
Authors
Brown, MPS
Grundy, WN
Lin, D
Cristianini, N
Sugnet, CW
Furey, TS
Ares, M
Haussler, D
Affiliations
[1] Univ Calif Santa Cruz, Dept Comp Sci, Santa Cruz, CA 95064 USA
[2] Univ Calif Santa Cruz, Ctr Mol Biol RNA, Dept Biol, Santa Cruz, CA 95064 USA
[3] Columbia Univ, Dept Comp Sci, New York, NY 10025 USA
[4] Univ Bristol, Dept Engn Math, Bristol BS8 1TR, Avon, England
DOI
10.1073/pnas.97.1.262
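Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09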
Abstract
We introduce a method of functionally classifying genes by using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines (SVMs). SVMs are considered a supervised computer learning method because they exploit prior knowledge of gene function to identify unknown genes of similar function from expression data. SVMs avoid several problems associated with unsupervised clustering methods, such as hierarchical clustering and self-organizing maps. SVMs have many mathematical features that make them attractive for gene expression analysis, including their flexibility in choosing a similarity function, sparseness of solution when dealing with large data sets, the ability to handle large feature spaces, and the ability to identify outliers. We test several SVMs that use different similarity metrics, as well as some other supervised learning methods, and find that the SVMs best identify sets of genes with a common function using expression data. Finally, we use SVMs to predict functional roles for uncharacterized yeast ORFs based on their expression data.
Pages: 262-267
Page count: 6
Related papers
50 in total
  • [1] Gene expression data analysis using support vector machines
    Chu, F
    Wang, LP
    PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS 2003, VOLS 1-4, 2003, : 2268 - 2271
  • [2] Transductive Support Vector Machines for classification of microarray gene expression data
    Semolini, R
    Von Zuben, FJ
    PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS 2003, VOLS 1-4, 2003, : 2946 - 2951
  • [3] Online Knowledge-Based Support Vector Machines
    Kunapuli, Gautam
    Bennett, Kristin P.
    Shabbeer, Amina
    Maclin, Richard
    Shavlik, Jude
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT II: EUROPEAN CONFERENCE, ECML PKDD 2010, 2010, 6322 : 145 - 161
  • [4] On Knowledge-Based Gene Expression Data Analysis
    Arakelyan, Arsen
    Aslanyan, Levon
    Boyajyan, Anna
    2013 COMPUTER SCIENCE AND INFORMATION TECHNOLOGIES (CSIT), 2013,
  • [5] Extraction of the cancer information from microarray of gene expression using Support Vector Machines
    Wilinski, A
    PHOTONICS APPLICATIONS IN ASTRONOMY, COMMUNICATIONS, INDUSTRY, AND HIGH-ENERGY PHYSICS EXPERIMENTS IV, 2006, 6159
  • [6] A note on classification of gene expression data using support vector machines
    Fujarewicz, K
    Kimmel, M
    Rzeszowska-Wolny, J
    Swierniak, A
    JOURNAL OF BIOLOGICAL SYSTEMS, 2003, 11 (01) : 43 - 56
  • [7] Bagged ensembles of Support Vector Machines for gene expression data analysis
    Valentini, G
    Muselli, M
    Ruffino, F
    PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS 2003, VOLS 1-4, 2003, : 1844 - 1849
  • [8] Classification of Dengue Fever Patients Based on Gene Expression Data Using Support Vector Machines
    Gomes, Ana Lisa V.
    Wee, Lawrence J. K.
    Khan, Asif M.
    Gil, Laura H. V. G.
    Marques, Ernesto T. A., Jr.
    Calzavara-Silva, Carlos E.
    Tan, Tin Wee
    PLOS ONE, 2010, 5 (06):
  • [9] Comparison of support vector machines to other classifiers using gene expression data
    Shieh, GS
    Jiang, YC
    Shih, YS
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2006, 35 (01) : 241 - 256
  • [10] Framework for knowledge-based integrative analysis of microarray data
    Shi, Jiantao
    Wang, Kankan
    Zhang, Ji
    2009 INTERNATIONAL JOINT CONFERENCE ON BIOINFORMATICS, SYSTEMS BIOLOGY AND INTELLIGENT COMPUTING, PROCEEDINGS, 2009, : 56 - +