The impact of feature selection on one and two-class classification performance for plant microRNAs

Cited by: 12
Authors
Khalifa, Waleed [1 ,2 ]
Yousef, Malik [1 ,2 ]
Demirci, Muserref Duygu Sacar [3 ]
Allmer, Jens [3 ,4 ]
Affiliations
[1] Coll Sakhnin, Comp Sci, Sakhnin, Israel
[2] Galilee Soc, Inst Appl Res, Shefa Amr, Israel
[3] Izmir Inst Technol, Mol Biol & Genet, Izmir, Turkey
[4] Bionia Inc, IZTEKGEB, Izmir, Turkey
Source
PEERJ | 2016 / Vol. 4
Keywords
MicroRNA; Machine learning; Feature selection; Plant; One-class classification; Two-class classification; PREDICTION; SVM; MIRBASE;
DOI
10.7717/peerj.2135
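Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [Natural sciences, general];
Discipline classification codes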
07 ; 0710 ; 09 ;
Abstract
MicroRNAs (miRNAs) are short nucleotide sequences that form a typical hairpin structure, which is recognized by a complex enzyme machinery. This recognition ultimately leads to the incorporation of 18-24 nt long mature miRNAs into RISC, where they act as recognition keys that aid in the regulation of target mRNAs. Determining miRNAs experimentally is involved, and machine learning is therefore used to complement such endeavors. The success of machine learning depends mostly on proper input data and on appropriate features for parameterizing the data. Two-class classification (TCC) is generally used in the field, but because negative examples are hard to come by, one-class classification (OCC) has also been tried for pre-miRNA detection. Since both positive and negative examples are currently somewhat limited, feature selection can prove vital for furthering the field of pre-miRNA detection. In this study, we compare the performance of OCC and TCC using eight feature selection methods and seven different plant species providing positive pre-miRNA examples. Feature selection was very successful for OCC, where the best feature selection method achieved an average accuracy of 95.6%, approximately 29% better than the worst method, which achieved 66.9% accuracy. While the performance is comparable to TCC, which performs up to 3% better than OCC, TCC is much less affected by feature selection: its largest performance gap is approximately 13%, which occurs for only two of the feature selection methodologies. We conclude that feature selection is crucially important for OCC and that OCC can perform on par with TCC given the proper set of features.
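The contrast the abstract draws between OCC and TCC can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' pipeline: the data are synthetic Gaussian vectors rather than pre-miRNA features, the one-class model is a simple centroid-distance threshold, the two-class model is a nearest-centroid rule, and the feature selection is a low-variance filter computed from positives only. All function names are hypothetical.

```python
import math
import random
import statistics


def make_samples(rng, n, shift):
    """Synthetic rows: features 0-1 are informative (mean `shift`, sd 1);
    features 2-7 are uninformative noise (mean 0, sd 5)."""
    return [
        [rng.gauss(shift, 1.0) for _ in range(2)]
        + [rng.gauss(0.0, 5.0) for _ in range(6)]
        for _ in range(n)
    ]


def centroid(rows, cols):
    """Mean of each selected feature column."""
    return [statistics.fmean([r[c] for r in rows]) for c in cols]


def dist(row, center, cols):
    """Euclidean distance restricted to the selected columns."""
    return math.sqrt(sum((row[c] - m) ** 2 for c, m in zip(cols, center)))


def select_low_variance(pos, k):
    """Filter-style selection using positives only: keep the k features
    with the smallest variance across the positive examples."""
    n_features = len(pos[0])
    variances = [statistics.pvariance([r[c] for r in pos]) for c in range(n_features)]
    ranked = sorted(range(n_features), key=variances.__getitem__)
    return sorted(ranked[:k])


def occ_accuracy(train_pos, test_pos, test_neg, cols):
    """One-class: accept a sample if it lies within the 95th-percentile
    training distance from the positive centroid. No negatives are used
    for training."""
    center = centroid(train_pos, cols)
    radii = sorted(dist(r, center, cols) for r in train_pos)
    threshold = radii[int(0.95 * len(radii))]
    hits = sum(dist(r, center, cols) <= threshold for r in test_pos)
    hits += sum(dist(r, center, cols) > threshold for r in test_neg)
    return hits / (len(test_pos) + len(test_neg))


def tcc_accuracy(train_pos, train_neg, test_pos, test_neg, cols):
    """Two-class: assign each sample to the nearer of the two class centroids."""
    c_pos, c_neg = centroid(train_pos, cols), centroid(train_neg, cols)
    hits = sum(dist(r, c_pos, cols) < dist(r, c_neg, cols) for r in test_pos)
    hits += sum(dist(r, c_neg, cols) <= dist(r, c_pos, cols) for r in test_neg)
    return hits / (len(test_pos) + len(test_neg))


rng = random.Random(0)  # fixed seed so the run is repeatable
train_pos, test_pos = make_samples(rng, 200, 4.0), make_samples(rng, 200, 4.0)
train_neg, test_neg = make_samples(rng, 200, 0.0), make_samples(rng, 200, 0.0)

all_cols = list(range(8))
selected = select_low_variance(train_pos, k=2)

results = {
    "OCC, all features": occ_accuracy(train_pos, test_pos, test_neg, all_cols),
    "OCC, selected": occ_accuracy(train_pos, test_pos, test_neg, selected),
    "TCC, all features": tcc_accuracy(train_pos, train_neg, test_pos, test_neg, all_cols),
    "TCC, selected": tcc_accuracy(train_pos, train_neg, test_pos, test_neg, selected),
}
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```

In this toy setting the one-class model has no negatives to tell it which dimensions matter, so noisy features inflate its acceptance radius and it depends heavily on the selection step, whereas the two-class model can partly average the noise out. The numbers it prints say nothing about the accuracies reported in the abstract.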
Pages: 13