Cluster Tree based Multi-Label Classification for Protein Function Prediction

Cited by: 0
Authors
Wu, Qingyao [1 ,2 ]
Ye, Yunming [1 ,2 ]
Zhang, Xiaofeng [1 ,2 ]
Ho, Shen-Shyang [3 ]
Affiliations
[1] Harbin Inst Technol, Shenzhen Grad Sch, Dept Comp Sci, Shenzhen, Peoples R China
[2] Shenzhen Key Lab Internet Informat Collaboration, Shenzhen, Peoples R China
[3] Nanyang Technol Univ, Sch Comp Engn, Singapore, Singapore
Keywords
Data mining; Multi-label data; Multi-label classification; Protein function prediction
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
Automatically assigning functions to unknown proteins is a key task in computational biology. Proteins in nature belong to multiple classes according to the functions they perform, and many efforts have been made to cast protein function prediction as a multi-label learning problem. This paper proposes a novel Cluster Tree based Multi-label Learning algorithm (CTML) for protein function prediction. The main idea is to compute, at each node of the tree, a set of predictive labels for multi-label prediction, using k-means clustering and predictive functions learned from the training data at that node. As the predictive labels propagate from the root node to the leaf nodes, the correlations between labels are preserved. Experimental results on benchmark data (the genbase and yeast datasets) show that the proposed CTML algorithm is effective in predicting protein functions. Moreover, its classification performance is competitive with that of other baseline multi-label learning algorithms.
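The cluster-tree idea in the abstract can be illustrated with a small sketch. This is not the authors' CTML implementation: the abstract does not specify the predictive functions or the propagation rule, so the code below makes its own assumptions. Each node keeps the labels carried by at least a `threshold` fraction of its training instances, nodes are split with a plain 2-means, and prediction walks from the root to a leaf, keeping the most recent non-empty label set. All function names and parameters (`two_means`, `node_labels`, `build_tree`, `predict`, `min_size`, `max_depth`, `threshold`) are hypothetical.

```python
from collections import Counter

def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def two_means(X, iters=20):
    """Plain k-means with k=2 and a deterministic start:
    the first point and the point farthest from it."""
    cents = [list(X[0]), list(max(X, key=lambda p: dist2(p, X[0])))]
    assign = []
    for _ in range(iters):
        new_assign = [0 if dist2(p, cents[0]) <= dist2(p, cents[1]) else 1
                      for p in X]
        if new_assign == assign:
            break
        assign = new_assign
        for j in (0, 1):
            members = [p for p, a in zip(X, assign) if a == j]
            if members:
                cents[j] = mean(members)
    return cents, assign

def node_labels(Y, threshold):
    """Assumed node rule: keep labels held by >= threshold of the instances."""
    counts = Counter(label for labels in Y for label in labels)
    return {label for label, c in counts.items() if c / len(Y) >= threshold}

def build_tree(X, Y, min_size=3, max_depth=2, threshold=0.5, depth=0):
    """Recursively cluster the instances; every node stores its label set."""
    node = {"labels": node_labels(Y, threshold), "children": None}
    if depth >= max_depth or len(X) <= min_size:
        return node
    cents, assign = two_means(X)
    if len(set(assign)) < 2:  # clustering could not separate the data
        return node
    node["children"] = []
    for j in (0, 1):
        idx = [i for i, a in enumerate(assign) if a == j]
        child = build_tree([X[i] for i in idx], [Y[i] for i in idx],
                           min_size, max_depth, threshold, depth + 1)
        node["children"].append((cents[j], child))
    return node

def predict(tree, x):
    """Route x to the nearest child centroid at each level; labels from
    ancestors are carried down unless a deeper node has its own label set."""
    node, labels = tree, tree["labels"]
    while node["children"]:
        _, node = min(node["children"], key=lambda cc: dist2(x, cc[0]))
        if node["labels"]:
            labels = node["labels"]
    return labels

# Toy data: two well-separated groups with different label sets.
X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
Y = [{"a", "b"}, {"a", "b"}, {"a"}, {"c"}, {"c", "d"}, {"c", "d"}]
tree = build_tree(X, Y)
print(sorted(predict(tree, [0.5, 0.5])))   # ['a', 'b']
print(sorted(predict(tree, [10, 10.5])))   # ['c', 'd']
```

On the toy data the root keeps only the labels shared by half of all instances, while each leaf recovers the label combination typical of its cluster, which is one simple way a tree of clusters can retain label co-occurrence.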
Pages: 4