Optimal linear ensemble of binary classifiers

Cited by: 1
Authors
Ahsen, Mehmet Eren [1 ,2 ,5 ]
Vogel, Robert [3 ,4 ]
Stolovitzky, Gustavo [3 ]
Affiliations
[1] Univ Illinois, Dept Business Adm, 1206 S Sixth St, Champaign, IL 61820 USA
[2] Univ Illinois, Dept Biomed & Translat Sci, Urbana, IL 61801 USA
[3] IBM Corp, Thomas J Watson Res Ctr, New York, NY 10598 USA
[4] Scripps Res, Dept Integrated Struct & Computat Biol, La Jolla, CA 92037 USA
[5] Univ Illinois, Dept Biomed & Translat Sci, 1206 S Sixth St, Champaign, IL 61820 USA
Source
BIOINFORMATICS ADVANCES | 2024, Vol. 4, Issue 1
Keywords
PREDICTION; CHALLENGE; AREA; CARE
DOI
10.1093/bioadv/vbae093
CLC classification number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Motivation: The integration of vast, complex biological data with computational models offers profound insights and predictive accuracy. Yet such models face two challenges: poor generalization and limited labeled data.
Results: To overcome these difficulties in binary classification tasks, we developed the Method for Optimal Classification by Aggregation (MOCA) algorithm, which addresses the problem of generalization by virtue of being an ensemble learning method and can be used in problems with limited or no labeled data. We developed both an unsupervised (uMOCA) and a supervised (sMOCA) variant of MOCA. For uMOCA, we show how to infer the MOCA weights in an unsupervised way; these weights are optimal under the assumption of class-conditionally independent classifier predictions. When labels are available, sMOCA uses empirically computed MOCA weights. We demonstrate the performance of uMOCA and sMOCA using simulated data as well as real data previously used in Dialogue on Reverse Engineering Assessment and Methods (DREAM) challenges. We also propose an application of sMOCA to transfer learning, in which pre-trained computational models from a domain with abundant labeled data are applied to a different domain where labeled data are scarcer.
Availability and implementation: GitHub repository, https://github.com/robert-vogel/moca.
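The abstract describes a supervised linear ensemble whose per-classifier weights are estimated from labeled data. The sketch below is an illustrative stand-in, not the paper's derived weighting: it rank-transforms each base classifier's scores, weights each classifier by its empirical AUC recentred at chance (2*AUC - 1, an assumed heuristic chosen here so that chance-level classifiers get near-zero weight), and sums. The simulated classifiers and noise levels are likewise invented for the demonstration; the authors' actual method is in the repository linked above.

```python
import numpy as np

def rank_transform(scores):
    # Map raw scores to ranks 1..n (ties broken by sort order).
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    return ranks

def empirical_auc(scores, labels):
    # AUC via the Mann-Whitney rank-sum identity:
    # U = (sum of positive-class ranks) - n_pos*(n_pos+1)/2.
    ranks = rank_transform(scores)
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    u = ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)

rng = np.random.default_rng(0)
n, m = 400, 3
labels = rng.integers(0, 2, size=n)

# Simulate m class-conditionally independent base classifiers of
# varying skill: each score is the label plus Gaussian noise.
noise = np.array([0.6, 1.0, 1.6])
scores = labels[None, :] + noise[:, None] * rng.normal(size=(m, n))

# Supervised weights (illustrative choice): empirical AUC minus chance.
aucs = np.array([empirical_auc(s, labels) for s in scores])
weights = 2 * aucs - 1

# Linear ensemble of rank-transformed scores.
ensemble = weights @ np.apply_along_axis(rank_transform, 1, scores)
print("base AUCs:", np.round(aucs, 3),
      "ensemble AUC:", round(empirical_auc(ensemble, labels), 3))
```

Under the simulation above, the weighted rank ensemble typically matches or exceeds the best individual classifier, which is the behavior the MOCA family of methods formalizes.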
Pages: 10