i6mA-DNCP: Computational Identification of DNA N6-Methyladenine Sites in the Rice Genome Using Optimized Dinucleotide-Based Features

Cited by: 33
Authors
Kong, Liang [1 ]
Zhang, Lichao [2 ,3 ]
Affiliations
[1] Hebei Normal Univ Sci & Technol, Sch Math & Informat Sci & Technol, Qinhuangdao 066004, Hebei, Peoples R China
[2] Northeastern Univ Qinhuangdao, Sch Math & Stat, Qinhuangdao 066004, Hebei, Peoples R China
[3] Northeastern Univ, Coll Sci, Shenyang 110819, Liaoning, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
N-6-methyladenine; dinucleotide composition; DNA properties; bagging; START SITES; METHYLATION; N-6-ADENINE; N6-METHYLADENINE; PSEKNC; MODES;
DOI
10.3390/genes10100828
Chinese Library Classification
Q3 [Genetics]
Discipline classification codes
071007 ; 090102 ;
Abstract
DNA N-6-methyladenine (6mA) plays an important role in regulating the gene expression of eukaryotes. Accurate identification of 6mA sites may assist in understanding genomic 6mA distributions and biological functions. Various experimental methods have been applied to detect 6mA sites in a genome-wide scope, but they are too time-consuming and expensive. Developing computational methods to rapidly identify 6mA sites is needed. In this paper, a new machine learning-based method, i6mA-DNCP, was proposed for identifying 6mA sites in the rice genome. Dinucleotide composition and dinucleotide-based DNA properties were first employed to represent DNA sequences. After a specially designed DNA property selection process, a bagging classifier was used to build the prediction model. The jackknife test on a benchmark dataset demonstrated that i6mA-DNCP could obtain 84.43% sensitivity, 88.86% specificity, 86.65% accuracy, a 0.734 Matthews correlation coefficient (MCC), and a 0.926 area under the receiver operating characteristic curve (AUC). Moreover, three independent datasets were established to assess the generalization ability of our method. Extensive experiments validated the effectiveness of i6mA-DNCP.
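The abstract names dinucleotide composition as one of the sequence representations and reports sensitivity, specificity, accuracy, and MCC. The following is a minimal Python sketch of those two ideas, not the authors' implementation: the function names `dinucleotide_composition` and `binary_metrics` are hypothetical, the DNA property features, the property selection step, and the bagging classifier are not covered, and the consistency check at the end assumes a benchmark with equal numbers of positive and negative samples, which the abstract does not state.

```python
from itertools import product
from math import sqrt

# All 16 dinucleotides in a fixed order: AA, AC, AG, AT, CA, ...
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dinucleotide_composition(seq):
    """Return the 16 normalized frequencies of overlapping dinucleotides."""
    seq = seq.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing N or other symbols
            counts[pair] += 1
    total = sum(counts.values())
    return [counts[d] / total if total else 0.0 for d in DINUCLEOTIDES]


def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy, MCC) from a confusion matrix."""
    sn = tp / (tp + fn)
    sp = tn / (tn + fp)
    acc = (tp + tn) / (tp + fn + tn + fp)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sn, sp, acc, mcc


if __name__ == "__main__":
    # Toy sequence, not taken from the benchmark dataset.
    features = dinucleotide_composition("ACGTAC")
    print(dict(zip(DINUCLEOTIDES, features)))

    # ASSUMPTION: balanced classes, so the reported rates can be used
    # directly as per-class fractions of a confusion matrix.
    sn, sp, acc, mcc = binary_metrics(tp=0.8443, fn=0.1557, tn=0.8886, fp=0.1114)
    print(f"Acc={acc:.4f}  MCC={mcc:.3f}")
```

Under the balanced-class assumption, the reported sensitivity and specificity give an accuracy of about 0.8665 and an MCC of about 0.734, which agrees with the accuracy and MCC figures in the abstract.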
Pages: 13