A Novel Sequence-Based Method of Predicting Protein DNA-Binding Residues, Using a Machine Learning Approach

Cited by: 5
Authors
Cai, Yudong [1 ,2 ]
He, ZhiSong [3 ]
Shi, Xiaohe [4 ,5 ]
Kong, Xiangying [4 ,5 ,6 ]
Gu, Lei [7 ]
Xie, Lu [8 ]
Affiliations
[1] Shanghai Univ, Inst Syst Biol, Shanghai 200244, Peoples R China
[2] Fudan Univ, Ctr Computat Syst Biol, Shanghai 200433, Peoples R China
[3] Zhejiang Univ, Dept Bioinformat, Coll Life Sci, Hangzhou 310058, Zhejiang, Peoples R China
[4] Chinese Acad Sci, Shanghai Inst Biol Sci, Inst Hlth Sci, Beijing 100864, Peoples R China
[5] Shanghai Jiao Tong Univ, Sch Med, Shanghai, Peoples R China
[6] Shanghai Jiao Tong Univ, Ruijin Hosp, State Key Lab Med Genom, Shanghai 200025, Peoples R China
[7] Fraunhofer Inst Algorithms & Sci Comp, Dept Bioinformat, Aachen, Germany
[8] Shanghai Ctr Bioinformat Technol, Shanghai 200235, Peoples R China
Keywords
bioinformatics; data mining; machine learning; mRMR; protein-DNA interaction; SITES; INFORMATION; IDENTIFICATION; RECOGNITION; MODELS; MOTIFS; DOMAIN; P53;
DOI
10.1007/s10059-010-0093-0
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification codes
071010 ; 081704 ;
Abstract
Protein-DNA interactions play an essential role in transcriptional regulation, DNA repair, and many vital biological processes. The mechanism of protein-DNA binding, however, remains unclear. For the study of many diseases, researchers must improve their understanding of the amino acid motifs that recognize DNA. Because identifying these motifs experimentally is expensive and time-consuming, it is necessary to devise an approach for computational prediction. Some in silico methods have been developed, but there are still considerable limitations. In this study, we used a machine learning approach to develop a new sequence-based method of predicting protein-DNA binding residues. To make these predictions, we used the properties of the micro-environment of each amino acid from the AAIndex as well as conservation scores. Testing by the cross-validation method, we obtained an overall accuracy of 94.89%. Our method shows that the amino acid micro-environment is important for DNA binding, and that it is possible to identify the protein-DNA binding sites with it.
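The abstract describes each residue by the properties of its sequence micro-environment. A minimal sketch of one common way to encode that idea is a sliding window over the sequence, shown below. The window half-width, the zero padding at the sequence ends, and the use of a single property (the Kyte-Doolittle hydropathy scale, which is one AAindex entry) are assumptions made for illustration only; the abstract does not state which properties or window size the authors used, and the function name `window_features` is hypothetical.

```python
# Kyte-Doolittle hydropathy values, used here as a stand-in for one
# AAindex property (an assumption; the abstract names no specific index).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_features(sequence, half_width=3, pad_value=0.0):
    """Return one feature row per residue.

    Each row holds the property values of the residue and its
    `half_width` neighbours on either side. Positions that fall
    outside the sequence, and unknown residue letters, are filled
    with `pad_value`.
    """
    n = len(sequence)
    features = []
    for i in range(n):
        row = []
        for j in range(i - half_width, i + half_width + 1):
            if 0 <= j < n:
                row.append(KYTE_DOOLITTLE.get(sequence[j], pad_value))
            else:
                row.append(pad_value)
        features.append(row)
    return features


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the paper's data set.
    for residue, row in zip("MKRA", window_features("MKRA", half_width=1)):
        print(residue, row)
```

In a fuller pipeline, rows like these would be extended with further properties and per-position conservation scores before feature selection and classification.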
Pages: 99-105
Page count: 7
Related papers
50 in total
  • [1] DRBpred: A sequence-based machine learning method to effectively predict DNA- and RNA-binding residues
    Ul Kabir, Md Wasi
    Alawad, Duaa Mohammad
    Pokhrel, Pujan
    Hoque, Md Tamjidul
    COMPUTERS IN BIOLOGY AND MEDICINE, 2024, 170
  • [2] Sequence-based prediction of DNA-binding sites on DNA-binding proteins
    Gou, Z.
    Hwang, S.
    Kuznetsov, I. B.
    PROCEEDINGS OF THE FIFTH INTERNATIONAL CONFERENCE ON BIOINFORMATICS OF GENOME REGULATION AND STRUCTURE, VOL 1, 2006, : 268 - +
  • [3] DP-Bind: a Web server for sequence-based prediction of DNA-binding residues in DNA-binding proteins
    Hwang, Seungwoo
    Gou, Zhenkun
    Kuznetsov, Igor B.
    BIOINFORMATICS, 2007, 23 (05) : 634 - 636
  • [4] Sequence-Based Prediction of DNA-Binding Residues in Proteins with Conservation and Correlation Information
    Ma, Xin
    Guo, Jing
    Liu, Hong-De
    Xie, Jian-Ming
    Sun, Xiao
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2012, 9 (06) : 1766 - 1775
  • [5] ProteDNA: a sequence-based predictor of sequence-specific DNA-binding residues in transcription factors
    Chu, Wen-Yi
    Huang, Yu-Feng
    Huang, Chun-Chin
    Cheng, Yi-Sheng
    Huang, Chien-Kang
    Oyang, Yen-Jen
    NUCLEIC ACIDS RESEARCH, 2009, 37 : W396 - W401
  • [6] Sequence-based machine learning method for predicting the effects of phosphorylation on protein-protein interactions
    Hong, Xiaokun
    Lv, Jiyang
    Li, Zhengxin
    Xiong, Yi
    Zhang, Jian
    Chen, Hai-Feng
    INTERNATIONAL JOURNAL OF BIOLOGICAL MACROMOLECULES, 2023, 243
  • [7] A novel prediction method for protein DNA-binding residues based on neighboring residue correlations
    Song, Jiazhi
    Liu, Guixia
    Jiang, Jingqing
    BIOTECHNOLOGY & BIOTECHNOLOGICAL EQUIPMENT, 2022, 36 (01) : 865 - 877
  • [8] Predicting Protein-DNA Binding Residues by Weightedly Combining Sequence-Based Features and Boosting Multiple SVMs
    Hu, Jun
    Li, Yang
    Zhang, Ming
    Yang, Xibei
    Shen, Hong-Bin
    Yu, Dong-Jun
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2017, 14 (06) : 1389 - 1398
  • [9] TargetDBP: Accurate DNA-Binding Protein Prediction Via Sequence-Based Multi-View Feature Learning
    Hu, Jun
    Zhou, Xiao-Gen
    Zhu, Yi-Heng
    Yu, Dong-Jun
    Zhang, Gui-Jun
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2020, 17 (04) : 1419 - 1429
  • [10] A Novel Sequence-Based Feature for the Identification of DNA-Binding Sites in Proteins Using Jensen-Shannon Divergence
    Dang, Truong Khanh Linh
    Meckbach, Cornelia
    Tacke, Rebecca
    Waack, Stephan
    Gueltas, Mehmet
    ENTROPY, 2016, 18 (10)