Feasibility of Active Machine Learning for Multiclass Compound Classification

Cited by: 30
Authors
Lang, Tobias [1, 2]
Flachsenberg, Florian [1]
von Luxburg, Ulrike [3]
Rarey, Matthias [1]
Affiliations
[1] Univ Hamburg, Ctr Bioinformat, D-20146 Hamburg, Germany
[2] Univ Hamburg, Dept Comp Sci, Schluterstr 70, D-20146 Hamburg, Germany
[3] Univ Tubingen, Dept Comp Sci, D-72076 Tubingen, Germany
Keywords
DISCOVERY; TOOL;
DOI
10.1021/acs.jcim.5b00332
Chinese Library Classification (CLC) number
R914 [Medicinal Chemistry];
Discipline classification code
100701;
Abstract
A common task in the hit-to-lead process is classifying sets of compounds into multiple, usually structural classes, which build the groundwork for subsequent SAR studies. Machine learning techniques can be used to automate this process by learning classification models from training compounds of each class. Gathering class information for compounds can be cost-intensive as the required data needs to be provided by human experts or experiments. This paper studies whether active machine learning can be used to reduce the required number of training compounds. Active learning is a machine learning method which processes class label data in an iterative fashion. It has gained much attention in a broad range of application areas. In this paper, an active learning method for multiclass compound classification is proposed. This method selects informative training compounds so as to optimally support the learning progress. The combination with human feedback leads to a semiautomated interactive multiclass classification procedure. This method was investigated empirically on 15 compound classification tasks containing 86-2870 compounds in 3-38 classes. The empirical results show that active learning can solve these classification tasks using 10-80% of the data which would be necessary for standard learning techniques.
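The iterative procedure outlined in the abstract (select an informative compound, ask an expert for its class, retrain, repeat) can be illustrated with a minimal sketch. The abstract does not specify the classifier or the selection criterion, so everything below is an assumption made for illustration: a nearest-centroid classifier, margin-based uncertainty sampling, synthetic 2D points standing in for compound descriptors, and the hypothetical names `make_toy_data`, `active_learning`, and `oracle`. It is not the authors' method.

```python
import math
import random


def make_toy_data(n_per_class=30, seed=0):
    """Three Gaussian blobs standing in for compound descriptors (synthetic, hypothetical)."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (4.0, 0.0), (2.0, 4.0)]
    points, labels = [], []
    for cls, (cx, cy) in enumerate(centers):
        for _ in range(n_per_class):
            points.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
            labels.append(cls)
    return points, labels


def centroids(points, labeled):
    """Mean descriptor vector per class, computed from the labeled indices only."""
    sums = {}
    for i, cls in labeled.items():
        sx, sy, n = sums.get(cls, (0.0, 0.0, 0))
        sums[cls] = (sx + points[i][0], sy + points[i][1], n + 1)
    return {cls: (sx / n, sy / n) for cls, (sx, sy, n) in sums.items()}


def margin(point, cents):
    """Gap between the two smallest centroid distances; a small gap means high uncertainty."""
    dists = sorted(math.dist(point, c) for c in cents.values())
    return dists[1] - dists[0]


def predict(point, cents):
    """Assign the class of the nearest centroid."""
    return min(cents, key=lambda cls: math.dist(point, cents[cls]))


def active_learning(points, oracle, n_queries, seed=0):
    """Query `oracle` (the simulated human expert) for n_queries labels, one per iteration."""
    rng = random.Random(seed)
    first = rng.randrange(len(points))
    labeled = {first: oracle(first)}
    for _ in range(n_queries - 1):
        pool = [i for i in range(len(points)) if i not in labeled]
        if not pool:
            break
        cents = centroids(points, labeled)
        if len(cents) < 2:
            # Assumption: with one known class, explore the point farthest from it.
            only = next(iter(cents.values()))
            query = max(pool, key=lambda i: math.dist(points[i], only))
        else:
            # Uncertainty sampling: query the point with the smallest margin.
            query = min(pool, key=lambda i: margin(points[i], cents))
        labeled[query] = oracle(query)
    return labeled


if __name__ == "__main__":
    pts, true_labels = make_toy_data()
    chosen = active_learning(pts, lambda i: true_labels[i], n_queries=12)
    model = centroids(pts, chosen)
    hits = sum(predict(p, model) == y for p, y in zip(pts, true_labels))
    print(f"labeled {len(chosen)} of {len(pts)} points, accuracy {hits / len(pts):.2f}")
```

In this sketch the `oracle` callable plays the role of the human expert or experiment that supplies class labels; replacing the lambda with an interactive prompt would give a semiautomated loop in the spirit of the one described above.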
Pages: 12-20
Page count: 9