A Binary Classifier for the Prediction of EC Numbers of Enzymes

Cited by: 35
Authors
Cui, Hao [1]
Chen, Lei [1]
Affiliations
[1] Shanghai Maritime Univ, Coll Informat Engn, Shanghai 201306, Peoples R China
Funding
Natural Science Foundation of Shanghai;
Keywords
Enzyme; EC number; support vector machine; protein-protein interaction; Weka; binary classification; five-fold cross-validation; AMINO-ACID-COMPOSITION; SUPPORT VECTOR MACHINE; PROTEIN-STRUCTURE; FAMILY CLASSES; GENES; ALGORITHM;
DOI
10.2174/1570164616666190126103036
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Background: Identifying the Enzyme Commission (EC) numbers of enzymes is important for understanding the metabolic processes that produce enough energy to sustain life. Previous studies mainly focused on predicting the six main functional classes or the sub-functional classes, i.e., the first two digits of the EC number.
Objective: In this study, a binary classifier was proposed to identify the full EC number (four digits) of enzymes.
Methods: Enzymes were paired with their known EC numbers to form positive samples, and an equal number of negative samples was randomly generated. The associations between any two samples were evaluated by integrating the linkages between enzymes and EC numbers. The classic machine learning algorithm Support Vector Machine (SVM) was adopted as the prediction engine.
Results: Five-fold cross-validation on five datasets indicated that the overall accuracy, Matthews correlation coefficient and F1-measure were about 0.786, 0.576 and 0.771, respectively, suggesting the utility of the proposed classifier. In addition, the effectiveness of the classifier was examined by comparing it with classifiers based on other classic machine learning algorithms.
Conclusion: The proposed classifier, designed specifically for the problem addressed in this study, was effective in predicting the EC numbers of enzymes, as shown by testing it on five datasets containing randomly produced samples.
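The abstract describes balancing known enzyme/EC-number pairs with randomly produced negative pairs and reporting accuracy, Matthews correlation coefficient and F1-measure. The following is a minimal Python sketch of those two steps, not the authors' implementation: the function names `sample_negative_pairs` and `binary_metrics`, the enzyme identifiers and the sampling scheme (random enzyme/EC combinations that are not known pairings) are all assumptions made for illustration.

```python
import math
import random


def sample_negative_pairs(positive_pairs, seed=0):
    """Draw as many random (enzyme, EC number) pairs as there are positives.

    Assumption: a negative sample is any enzyme/EC combination that does not
    appear among the known (positive) pairings.
    """
    rng = random.Random(seed)
    positives = set(positive_pairs)
    enzymes = sorted({enzyme for enzyme, _ in positives})
    ec_numbers = sorted({ec for _, ec in positives})

    # Guard against an impossible request before entering the sampling loop.
    available = len(enzymes) * len(ec_numbers) - len(positives)
    if available < len(positives):
        raise ValueError("not enough unpaired combinations to balance the data")

    negatives = set()
    while len(negatives) < len(positives):
        pair = (rng.choice(enzymes), rng.choice(ec_numbers))
        if pair not in positives:
            negatives.add(pair)
    return sorted(negatives)


def binary_metrics(tp, tn, fp, fn):
    """Return (accuracy, MCC, F1) from the counts of a binary confusion matrix."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0
    return accuracy, mcc, f1


# Hypothetical enzyme identifiers paired with EC numbers, for illustration only.
positives = [("ENZ_A", "1.1.1.1"), ("ENZ_B", "2.7.1.1"), ("ENZ_C", "3.4.21.4")]
negatives = sample_negative_pairs(positives)
accuracy, mcc, f1 = binary_metrics(tp=40, tn=40, fp=10, fn=10)
```

In a five-fold cross-validation setting, the confusion counts would be accumulated over the five held-out folds before calling `binary_metrics`; repeating the negative sampling with different seeds would give several datasets, in the spirit of the five datasets mentioned in the abstract.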
Pages: 383-391
Number of pages: 9