MMM-QSAR recognition of ribonucleases without alignment: Comparison with an HMM model and isolation from Schizosaccharomyces pombe, prediction, and experimental assay of a new sequence

Cited by: 37
Authors
Agüero-Chapin, Guillermin [2 ,3 ,4 ]
Gonzalez-Diaz, Humberto [1 ,2 ,5 ]
de la Riva, Gustavo [6 ]
Rodriguez, Edrey [7 ]
Sanchez-Rodriguez, Aminael [3 ,4 ]
Podda, Gianni [2 ]
Vazquez-Padron, Roberto I. [8 ]
Affiliations
[1] Univ Santiago de Compostela, Fac Pharm, Inst Ind Pharm, Santiago De Compostela 15782, Spain
[2] Univ Cagliari, Dipartimento Farmaco Chim Tecnol, I-09124 Cagliari, Italy
[3] UCLV, CBQ, Santa Clara, Cuba
[4] UCLV, IBP, Fac Chem & Pharm, CAP, Santa Clara, Cuba
[5] Univ Santiago de Compostela, Fac Pharm, Dept Organ Chem, Santiago De Compostela 15782, Spain
[6] LANGEBIO, CINVESTAV, Guanajuato 36821, Mexico
[7] Caribbean Vitroplants, Santo Domingo 1464, Dominican Rep
[8] Univ Miami, Sch Med, Vasc Biol Inst, Miami, FL 33136 USA
DOI
10.1021/ci7003225
Chinese Library Classification number
R914 [Medicinal chemistry]
Discipline classification code
100701
Abstract
The study of type III RNases constitutes an important area in molecular biology. The pac1(+) gene is known to encode a particular RNase III that shares low amino acid similarity with the products of other genes despite having double-stranded ribonuclease activity. Bioinformatics methods based on sequence alignment may fail when the amino acid identity between a query sequence and others with similar functions is low (remote homologues) or when no similar sequence is recorded in the database. Quantitative structure-activity relationships (QSAR) applied to protein sequences may allow an alignment-independent prediction of protein function. These sequence QSAR-like methods often use 1D numerical parameters of the sequence as input to seek sequence-function relationships. However, first representing the sequence in 2D may uncover useful higher-order information. In the work described here we calculated, for the first time, the spectral moments of a Markov matrix (MMM) associated with a 2D-HP-map of a protein sequence. We used MMM values to characterize numerically 81 sequences of type III RNases and 133 proteins of a control group. We subsequently developed one MMM-QSAR and one classic hidden Markov model (HMM) based on the same data. Without using alignment, the MMM-QSAR discriminated RNases from other proteins with 97.35% accuracy, a result as good as that of the known HMM techniques. We also report for the first time the isolation of a new Pac1 protein (DQ647826) from Schizosaccharomyces pombe strain 428-4-1. The MMM-QSAR model predicts the new RNase III with the same accuracy as classical alignment methods. Experimental assay of this protein confirms the predicted activity. The present results suggest that MMM-QSAR models may be used for protein function annotation without sequence alignment, with the same accuracy as classic HMM models.
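As a rough illustration of the descriptor named in the abstract, the spectral moments of a Markov matrix can be taken as the traces of its successive powers. The sketch below is not the authors' procedure: the abstract does not say how the 2D-HP map is built or how its nodes are weighted, so the weight matrix `toy_weights` and the helper names `row_normalize`, `matmul`, and `spectral_moments` are hypothetical and used only to show the arithmetic.

```python
def row_normalize(weights):
    """Turn a non-negative square matrix into a row-stochastic (Markov) matrix."""
    markov = []
    for row in weights:
        total = sum(row)
        markov.append([w / total if total else 0.0 for w in row])
    return markov


def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_moments(markov, max_order):
    """Return [tr(M^0), tr(M^1), ..., tr(M^max_order)]."""
    n = len(markov)
    # Start from the identity matrix, i.e. M^0.
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    moments = []
    for _ in range(max_order + 1):
        moments.append(sum(power[i][i] for i in range(n)))
        power = matmul(power, markov)
    return moments


# Hypothetical weighted adjacency of a 3-node map (assumption, not from the paper):
# off-diagonal entries mark neighbouring nodes, diagonal entries are node weights.
toy_weights = [
    [1.0, 1.0, 0.0],
    [1.0, 2.0, 1.0],
    [0.0, 1.0, 1.0],
]
toy_markov = row_normalize(toy_weights)
toy_moments = spectral_moments(toy_markov, 3)
```

In a QSAR-style workflow, a vector such as `toy_moments` would be computed for every sequence and passed to a classifier as its numerical input; the choice of classifier is outside what the abstract specifies.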
Pages: 434-448
Page count: 15
Related papers
196 entries in total
[1]   Parallels in rRNA processing: Conserved features in the processing of the internal transcribed spacer 1 in the pre-rRNA from Schizosaccharomyces pombe [J].
Abeyrathne, PD ;
Nazar, RN .
BIOCHEMISTRY, 2005, 44 (51) :16977-16987
[2]   Novel 2D maps and coupling numbers for protein sequences. The first QSAR study of polygalacturonases; isolation and prediction of a novel sequence from Psidium guajava L. [J].
Agüero-Chapin, GA ;
González-Díaz, H ;
Molina, R ;
Varona-Santos, J ;
Uriarte, E ;
González-Díaz, Y .
FEBS LETTERS, 2006, 580 (03) :723-730
[3]  
[Anonymous], MOLECULES
[4]  
[Anonymous], 1989, Molecular Cloning
[5]   Probabilistic methods of identifying genes in prokaryotic genomes: Connections to the HMM theory [J].
Azad, RK ;
Borodovsky, M .
BRIEFINGS IN BIOINFORMATICS, 2004, 5 (02) :118-130
[6]   Fungal BLAST and Model Organism BLASTP Best Hits: new comparison resources at the Saccharomyces Genome Database (SGD) [J].
Balakrishnan, R ;
Christie, KR ;
Costanzo, MC ;
Dolinski, K ;
Dwight, SS ;
Engel, SR ;
Fisk, DG ;
Hirschman, JE ;
Hong, EL ;
Nash, R ;
Oughtred, R ;
Skrzypek, M ;
Theesfeld, CL ;
Binkley, G ;
Dong, Q ;
Lane, C ;
Sethuraman, A ;
Weng, S ;
Botstein, D ;
Cherry, JM .
NUCLEIC ACIDS RESEARCH, 2005, 33 :D374-D377
[7]  
Bateman Alex, 2002, Brief Bioinform, V3, P236, DOI 10.1093/bib/3.3.236
[8]   Protein folding in the hydrophobic-hydrophilic (HP) model is NP-complete [J].
Berger, B ;
Leighton, T .
JOURNAL OF COMPUTATIONAL BIOLOGY, 1998, 5 (01) :27-40
[9]   CONTROL OF REPLICATION OF PLASMID R1 - THE DUPLEX BETWEEN THE ANTISENSE RNA, COPA, AND ITS TARGET, COPT, IS PROCESSED SPECIFICALLY INVIVO AND INVITRO BY RNASE-III [J].
BLOMBERG, P ;
WAGNER, EGH ;
NORDSTROM, K .
EMBO JOURNAL, 1990, 9 (07) :2331-2340
[10]   HMMSTR: a hidden Markov model for local sequence-structure correlations in proteins [J].
Bystroff, C ;
Thorsson, V ;
Baker, D .
JOURNAL OF MOLECULAR BIOLOGY, 2000, 301 (01) :173-190