Efficient Feature Selection and Classification of Protein Sequence Data in Bioinformatics

Cited by: 19
Authors
Iqbal, Muhammad Javed [1]
Faye, Ibrahima [2]
Samir, Brahim Belhaouari [3]
Said, Abas Md [1]
Affiliations
[1] Univ Teknol PETRONAS, Dept Comp & Informat Sci, Tronoh 31750, Perak, Malaysia
[2] Univ Teknol PETRONAS, Fundamental & Appl Sci Dept, Tronoh 31750, Perak, Malaysia
[3] Alfaisal Univ, Coll Sci, Riyadh 11533, Saudi Arabia
Keywords
FEATURE-EXTRACTION;
DOI
10.1155/2014/173869
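Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];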
Subject classification code
07 ; 0710 ; 09 ;
Abstract
Bioinformatics has been an emerging area of research for the last three decades. Its ultimate aims are to store and manage biological data and to develop and analyze computational tools that enhance the understanding of those data. The volume of data accumulated under various sequencing projects is increasing exponentially, which presents difficulties for experimental methods. To reduce the gap between newly sequenced proteins and proteins with known functions, many computational techniques involving classification and clustering algorithms have been proposed. Classifying protein sequences into existing superfamilies helps predict the structure and function of the large number of newly discovered proteins. Existing classification results are unsatisfactory because of the very large number of features produced by the various feature encoding methods. In this work, a statistical metric-based feature selection technique is proposed to reduce the size of the extracted feature vector. The proposed method of protein classification shows significant improvement in terms of performance metrics: accuracy, sensitivity, specificity, recall, F-measure, and so forth.
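The abstract does not say which statistical metric or feature encoding the authors use, so the following is only a minimal sketch of the general idea of metric-based feature selection, not the paper's method. It assumes amino-acid composition as the encoding and a Fisher-style score (squared difference of class means over the sum of class variances) as the metric; the function names and the toy sequences are hypothetical.

```python
from statistics import mean, pvariance

# The 20 standard amino acids, one feature per residue type.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Encode a protein sequence as 20 amino-acid frequencies (assumed encoding)."""
    seq = seq.upper()
    n = len(seq) or 1  # avoid division by zero on an empty sequence
    return [seq.count(a) / n for a in AMINO_ACIDS]

def fisher_scores(X, y):
    """Score each feature: (difference of class means)^2 / (sum of class variances)."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(len(X[0])):
        p = [r[j] for r in pos]
        q = [r[j] for r in neg]
        spread = pvariance(p) + pvariance(q)
        # Small constant keeps the ratio finite when both variances are zero.
        scores.append((mean(p) - mean(q)) ** 2 / (spread + 1e-12))
    return scores

def select_top_k(X, y, k):
    """Keep the k highest-scoring feature columns; return their indices and the reduced matrix."""
    scores = fisher_scores(X, y)
    keep = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
    keep.sort()  # restore original column order
    return keep, [[row[j] for j in keep] for row in X]

# Toy data (hypothetical): class 1 is lysine-rich, class 0 is leucine-rich.
sequences = ["KKKKAKKG", "KKAKKKGK", "LLLLALLG", "LLALLLGL"]
labels = [1, 1, 0, 0]
X = [composition(s) for s in sequences]
keep, X_reduced = select_top_k(X, labels, k=2)
print([AMINO_ACIDS[j] for j in keep])
```

The reduced matrix `X_reduced` would then be passed to whichever classifier is being evaluated; only the columns that best separate the two classes under the chosen score are retained.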
Pages: 12