Integration of A Deep Learning Classifier with A Random Forest Approach for Predicting Malonylation Sites

Authors
Zhen Chen [1]
Ningning He [1]
Yu Huang [2]
Wen Tao Qin [3]
Xuhan Liu [4]
Lei Li [1, 2, 5]
Affiliations
[1] School of Basic Medicine, Qingdao University
[2] School of Data Science and Software Engineering, Qingdao University
[3] Department of Biochemistry, Schulich School of Medicine and Dentistry, University of Western Ontario
[4] Department of Information Technology, Beijing Oriental Yamei Gene Technology Institute Co. Ltd.
[5] Qingdao Cancer Institute, Qingdao University
Funding
National Natural Science Foundation of China
Keywords
Deep learning; Recurrent neural network; LSTM; Malonylation; Random forest;
DOI
Not available
Chinese Library Classification number
Q811.4 [Bioinformatics]; TP18 [Artificial intelligence theory]
Discipline classification codes
0711; 081104; 0812; 0831; 0835; 1405
Abstract
As a newly identified protein post-translational modification, malonylation is involved in a variety of biological functions. Recognizing malonylation sites in substrates is an initial but crucial step in elucidating the molecular mechanisms underlying protein malonylation. In this study, we constructed a deep learning (DL) network classifier based on long short-term memory (LSTM) with word embedding (LSTMWE) for the prediction of mammalian malonylation sites. LSTMWE performs better than traditional classifiers developed with common pre-defined feature encodings and better than a DL classifier based on LSTM with a one-hot vector. The performance of LSTMWE is sensitive to the size of the training set, but this limitation can be overcome by integration with a traditional machine learning (ML) classifier. Accordingly, an integrated approach called LEMP was developed, which combines LSTMWE with a random forest classifier that uses a novel encoding of enhanced amino acid content. LEMP not only outperforms the individual classifiers but is also superior to the currently available malonylation predictors. Additionally, it achieves promising performance with a low false positive rate, which is highly useful in practical prediction applications. Overall, LEMP is a useful tool for easily identifying malonylation sites with high confidence. LEMP is available at http://www.bioinfogo.org/lemp.
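The abstract contrasts two input representations for the LSTM: integer tokens fed to a learned word embedding, and fixed one-hot vectors. The sketch below only illustrates that difference at the encoding step. The alphabet order, the padding symbol `X`, the window size, and all function names are assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical illustration: alphabet order, padding symbol, and window
# length are assumptions, not details from the paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "X"  # assumed placeholder for positions beyond the protein termini
TOKEN_INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}
TOKEN_INDEX[PAD] = 0


def lysine_windows(sequence, flank=3):
    """Yield (position, window) for every lysine, padding at the termini."""
    padded = PAD * flank + sequence + PAD * flank
    for pos, residue in enumerate(sequence):
        if residue == "K":
            # The window of length 2 * flank + 1 is centered on the lysine.
            yield pos, padded[pos:pos + 2 * flank + 1]


def to_indices(window):
    """Integer tokens, the usual input to a trainable embedding layer."""
    return [TOKEN_INDEX[aa] for aa in window]


def to_one_hot(window):
    """Fixed one-hot rows (one per residue); padding maps to all zeros."""
    rows = []
    for aa in window:
        row = [0] * len(AMINO_ACIDS)
        if aa != PAD:
            row[TOKEN_INDEX[aa] - 1] = 1
        rows.append(row)
    return rows


windows = list(lysine_windows("MKTAYK", flank=2))
```

With integer tokens, an embedding layer can learn a dense vector per residue during training, whereas the one-hot rows are fixed in advance. Residues outside the assumed 20-letter alphabet would raise a `KeyError` in this sketch.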
Pages: 451-459
Number of pages: 9