Gene Classification Using Codon Usage and Support Vector Machines

Cited by: 28
Authors
Ma, Jianmin [1 ]
Nguyen, Minh N. [1 ]
Rajapakse, Jagath C. [1 ]
Affiliations
[1] Nanyang Technol Univ, Bioinformat Res Ctr, Singapore 637553, Singapore
Keywords
Codon usage bias; gene classification; Human Leukocyte Antigen (HLA); Major Histocompatibility Complex (MHC); relative synonymous codon usage (RSCU); Support Vector Machines (SVMs); MAJOR HISTOCOMPATIBILITY COMPLEX; INDEPENDENT COMPONENT ANALYSIS; MULTIPLE-SEQUENCE ALIGNMENT; ESCHERICHIA-COLI; CLUSTER-ANALYSIS; SACCHAROMYCES-CEREVISIAE; CANCER CLASSIFICATION; ARABIDOPSIS-THALIANA; BACILLUS-SUBTILIS; BINDING PEPTIDES;
DOI
10.1109/TCBB.2007.70240
Chinese Library Classification
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
A novel approach for gene classification is proposed that uses codon usage bias as the feature input to support vector machines (SVMs). The DNA sequence is first converted to a 59-dimensional feature vector, where each element corresponds to the relative synonymous codon usage (RSCU) frequency of a codon. Since the input to the classifier is independent of sequence length, the approach is especially useful when the sequences to be classified are of differing lengths and homology-based methods tend to fail. The method is demonstrated with 1,841 Human Leukocyte Antigen (HLA) sequences, which are classified into two major classes, HLA-I and HLA-II. Each major class is further classified into subgroups. Using codon usage frequencies, a binary SVM achieved an accuracy of 99.3 percent for HLA major class classification, and a multiclass SVM achieved accuracies of 99.73 percent and 98.38 percent for the subclass classification of HLA-I and HLA-II molecules, respectively. Comparisons with K-means clustering, with other classifiers, and with homology-based features are given. Results indicate that the classification based on codon usage bias is consistent with the biological functions of HLA molecules.
Pages: 134-143
Page count: 10
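
The abstract describes converting each DNA sequence into a 59-dimensional RSCU vector before classification. The following is a minimal sketch of that conversion, not the authors' implementation. It assumes the standard genetic code and the usual convention that the 59 codons are the 64 codons minus the three stop codons and the single-codon amino acids Met (ATG) and Trp (TGG). The names `FEATURE_CODONS` and `rscu_vector` are hypothetical, and assigning 0.0 when an amino acid is absent from the sequence is an assumption the abstract does not address.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order (TTT, TTC, TTA, ...).
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group codons into synonymous families keyed by amino acid.
FAMILIES = {}
for codon, aa in CODON_TABLE.items():
    FAMILIES.setdefault(aa, []).append(codon)

# Assumed feature set: drop stop codons and single-codon families (Met, Trp).
FEATURE_CODONS = sorted(
    c for c, aa in CODON_TABLE.items()
    if aa != "*" and len(FAMILIES[aa]) > 1
)


def rscu_vector(seq):
    """Return the RSCU value of each codon in FEATURE_CODONS for one sequence.

    RSCU = observed count of a codon divided by the count expected if all
    synonymous codons of its amino acid were used equally often.
    """
    seq = seq.upper().replace("U", "T")
    # Read in frame from position 0; a trailing partial codon is ignored.
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    vector = []
    for codon in FEATURE_CODONS:
        family = FAMILIES[CODON_TABLE[codon]]
        total = sum(counts[c] for c in family)
        # Assumption: 0.0 when the amino acid never occurs in the sequence.
        vector.append(counts[codon] * len(family) / total if total else 0.0)
    return vector
```

Because every sequence maps to a vector of the same length regardless of how many codons it contains, the resulting vectors could be passed directly to any fixed-input classifier such as an SVM.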