SFM: A novel sequence-based fusion method for disease genes identification and prioritization

Cited by: 10
Authors
Yousef, Abdulaziz [1 ]
Charkari, Nasrollah Moghadam [1 ]
Affiliations
[1] Tarbiat Modares Univ, Fac Elect & Comp Engn, Tehran, Iran
Keywords
Classification; Disease gene; Protein; Physicochemical properties of amino acid; Fusion method; PROTEIN-PROTEIN INTERACTIONS; PREDICTION; FEATURES; AUTOCORRELATION; CLASSIFICATION; SIMILARITY; SURFACE;
DOI
10.1016/j.jtbi.2015.07.010
Chinese Library Classification number
Q [Biological sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Identifying disease genes in the human genome is of great importance for improving the diagnosis and treatment of disease. Several machine learning methods have been introduced to identify disease genes. These methods differ mainly in three respects: the prior knowledge used to construct the feature vector for each instance (gene); the way negative data (non-disease genes) are selected, given that no experimental approach exists for finding them; and the classification method used to make the final decision. In this work, a novel sequence-based fusion method (SFM) is proposed to identify disease genes. Unlike existing methods, which rely on noisy and incomplete prior knowledge, SFM uses the amino acid sequences of the proteins, which are universally available data, to represent the genes (proteins) as four different feature vectors. To select more likely negative data from the candidate genes, the intersection of four negative sets generated using a distance approach is taken. A decision tree (C4.5) is then applied as a fusion method to combine the results of four independent state-of-the-art predictors based on the support vector machine (SVM) algorithm and to make the final decision. The proposed method was evaluated using standard measures, achieving a precision, recall and F-measure of 82.6%, 85.6% and 84%, respectively. These results confirm the efficiency and validity of the proposed method. (C) 2015 Elsevier Ltd. All rights reserved.
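The negative-selection step described in the abstract (intersecting four negative sets, each generated with a distance approach) can be illustrated with a minimal sketch. The abstract does not state the distance measure or the cut-off, so everything concrete here is an assumption made for illustration: Euclidean distance to the centroid of the known disease genes, a hypothetical `keep_fraction` parameter, and toy two-dimensional feature vectors. It is not the authors' implementation.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def far_candidates(positives, candidates, keep_fraction):
    """Ids of the candidates farthest from the positive centroid in one view.

    ASSUMPTION: "distance approach" is modelled as Euclidean distance to the
    centroid of known disease genes; keep_fraction is a hypothetical cut-off.
    """
    center = centroid(positives)
    ranked = sorted(candidates, key=lambda g: dist(candidates[g], center),
                    reverse=True)
    k = max(1, int(len(ranked) * keep_fraction))
    return set(ranked[:k])


def select_negatives(views, keep_fraction=0.5):
    """Intersect the per-view negative sets.

    views: list of (positives, candidates) pairs, one per feature
    representation; candidates maps gene id -> feature vector.
    """
    per_view = [far_candidates(p, c, keep_fraction) for p, c in views]
    return set.intersection(*per_view)


# Toy data: the same two positives and four candidate genes in four views.
positives = [[0.0, 0.0], [0.2, 0.0]]
views = [
    (positives, {"g1": [0.1, 0.1], "g2": [5.0, 5.0], "g3": [4.0, 4.0], "g4": [0.3, 0.0]}),
    (positives, {"g1": [3.0, 3.0], "g2": [6.0, 6.0], "g3": [0.1, 0.0], "g4": [0.2, 0.2]}),
    (positives, {"g1": [0.0, 0.1], "g2": [9.0, 0.0], "g3": [8.0, 0.0], "g4": [0.1, 0.1]}),
    (positives, {"g1": [0.5, 0.0], "g2": [7.0, 7.0], "g3": [0.2, 0.0], "g4": [6.0, 6.0]}),
]
negatives = select_negatives(views)
print(negatives)
```

In this toy data only `g2` is far from the disease genes in all four views, so it is the only gene kept as a likely negative. Taking the intersection is a conservative choice: a candidate that looks disease-like in any single representation is excluded from the negative set.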
Pages: 12-19
Number of pages: 8
Related papers
50 in total
  • [21] NUCLEOTIDE SEQUENCE-BASED APPROACHES TO HERBAL IDENTIFICATION
    Cimino, Matthew T.
    PHARMACEUTICAL BIOLOGY, 2009, 47 : 19 - 19
  • [22] Speech identification using a sequence-based heuristic
    Heinrich, G
    Proceedings ELMAR-2005, 2005, : 225 - 228
  • [23] A universal PCR method and its application in sequence-based identification of microorganisms in dairy
    Zhang, Hongfa
    You, Chunping
    INTERNATIONAL DAIRY JOURNAL, 2018, 85 : 41 - 48
  • [24] Sequence-Based Intelligent Model for Identification of Tumor T Cell Antigens Using Fusion Features
    Bibi, Nagina
    Khan, Mukhtaj
    Khan, Salman
    Noor, Sumaiya
    Alqahtani, Salman A.
    Ali, Abid
    Iqbal, Nadeem
    IEEE ACCESS, 2024, 12 : 155040 - 155051
  • [25] A sequence-based map of Arabidopsis genes with mutant phenotypes
    Meinke, DW
    Meinke, LK
    Showalter, TC
    Schissel, AM
    Mueller, LA
    Tzafrir, I
    PLANT PHYSIOLOGY, 2003, 131 (02) : 409 - 418
  • [26] A Novel Sequence-Based Method for Phosphorylation Site Prediction with Feature Selection and Analysis
    He, Zhi-Song
    Shi, Xiao-He
    Kong, Xiang-Ying
    Zhu, Yu-Bei
    Chou, Kuo-Chen
    PROTEIN AND PEPTIDE LETTERS, 2012, 19 (01): : 70 - 78
  • [27] Identification of a novel allele, HLA-A*24:02:50, by sequence-based typing
    Li, J. -P.
    Li, X. -F.
    Zhang, X.
    Lin, F. -Q.
    Zhang, K. -L.
    HLA, 2016, 87 (05) : 388 - U147
  • [28] Sequence-based typing identification of the novel allele HLA-B*40:482
    Shi, Xiu-Min
    Hu, Rui-Ping
    Li, Pei-Tong
    Han, Wei
    Gao, Su-Jun
    HLA, 2022, 100 (03) : 270 - 271
  • [29] Identification of the novel allele HLA-B*13:01:03 by sequence-based typing method in a Taiwanese individual
    Chu, C. -C.
    Lee, K. -C.
    Lee, H. -L.
    Lai, S. -K.
    Lin, M.
    TISSUE ANTIGENS, 2010, 76 (06): : 496 - 497
  • [30] Identification of a novel HLA-DRB1*14 allele by sequence-based typing
    Karvunidis, T
    Jindra, P
    Dorner, A
    Fischer, G
    Koza, V
    GENES AND IMMUNITY, 2005, 6 : S14 - S14