Alignment-Free Prediction of Polygalacturonases with Pseudofolding Topological Indices: Experimental Isolation from Coffea arabica and Prediction of a New Sequence

Cited by: 58
Authors
Agüero-Chapin, Guillermin [1, 2, 3]
Varona-Santos, Javier [4 ]
de la Riva, Gustavo A. [5 ]
Antunes, Agostinho [3 ]
Gonzalez-Villa, Tomas [6 ]
Uriarte, Eugenio [1 ]
Gonzalez-Diaz, Humberto [1, 6]
Affiliations
[1] Univ Santiago de Compostela, UBICA, Inst Ind Pharm, Dept Organ Chem, Fac Pharm, Santiago De Compostela 15782, Spain
[2] Cent Univ Las Villas, CBQ, Santa Clara 54830, Cuba
[3] Univ Porto, CIMAR, Ctr Interdisciplinar Invest Marinha & Ambiental, P-4050123 Oporto, Portugal
[4] Univ Miami, Sylvester Comprehens Canc Ctr, Miami, FL 33136 USA
[5] ITESS, Guanajuato 38900, Mexico
[6] Univ Santiago de Compostela, Dept Microbiol & Parasitol, Fac Pharm, Santiago De Compostela 15782, Spain
Keywords
Markov models; Topological Indices; Protein folding lattice networks; QSAR; Spectral moments; Entropy; Artificial Neural Networks; Polygalacturonases of plant, bacterial, nematode; PROTEIN SUBCELLULAR LOCATION; WEB-SERVER; SCLEROTINIA-SCLEROTIORUM; MEMBRANE-PROTEINS; ENZYME-KINETICS; GRAPH-THEORY; RATE LAWS; NETWORKS; MODEL
DOI
10.1021/pr800867y
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Polygalacturonases (PGs) have attracted the attention of microbiologists and of the biotechnology and pharmaceutical industries because these enzymes are involved in phytopathogen invasion and fruit ripening and are potential antimicrobial drug targets. Numeric Topological Indices (TIs) of protein pseudofolding lattices can be used as inputs for classification algorithms in Quantitative Structure-Activity Relationship (QSAR) studies. However, a comparative study of different QSAR models for PGs has not been reported. In this study, we calculated for the first time two classes of TIs, spectral moments (pi(k)) and entropy (theta(k)) values, for the Markov matrices associated with the pseudofolding lattices of 108 PGs and 100 heterogeneous non-PG proteins. Afterward, we developed different classifiers: linear ones based on Linear Discriminant Analysis (LDA) and four types of nonlinear Artificial Neural Networks (ANNs). The pi(k)-LDA model correctly classified 98.8% of the PGs and 100% of the non-PGs used to train the model, as well as 98.1% of all sequences in the external validation series. The pi(k)-LDA model was the most accurate and/or simplest model found. In addition, we report for the first time the experimental isolation and successful prediction of a new PG sequence from Coffea arabica. This sequence was deposited in GenBank by our group with accession number GDQ336394. Models of this type are an interesting alignment-free complement to alignment-based procedures.
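As a rough illustration of the two index classes named in the abstract, the sketch below computes spectral moments and entropies for a Markov matrix built from a tiny graph. Everything specific here is an assumption made for illustration: the 4-node path graph stands in for a pseudofolding lattice, the matrix is obtained by plain row normalization of an unweighted adjacency matrix, pi(k) is taken as the trace of the k-th matrix power, and theta(k) as the Shannon entropy of the node distribution after k steps from a uniform start. The abstract does not give the lattice construction, the edge weights, or the exact index definitions used in the paper, so the function names and formulas should not be read as the authors' implementation.

```python
import math


def markov_matrix(adjacency):
    """Row-normalize a non-negative adjacency matrix into a stochastic matrix."""
    matrix = []
    for row in adjacency:
        total = sum(row)
        if total == 0:
            raise ValueError("every node needs at least one edge")
        matrix.append([value / total for value in row])
    return matrix


def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_moments(pi, k_max):
    """Return [Tr(pi^0), Tr(pi^1), ..., Tr(pi^k_max)] (assumed definition)."""
    n = len(pi)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    moments = []
    for _ in range(k_max + 1):
        moments.append(sum(power[i][i] for i in range(n)))
        power = mat_mul(power, pi)
    return moments


def entropies(pi, p0, k_max):
    """Shannon entropy of the node distribution after k = 0..k_max steps."""
    n = len(pi)
    p = list(p0)
    values = []
    for _ in range(k_max + 1):
        values.append(-sum(x * math.log(x) for x in p if x > 0))
        # One Markov step: p_next[j] = sum_i p[i] * pi[i][j]
        p = [sum(p[i] * pi[i][j] for i in range(n)) for j in range(n)]
    return values


# Hypothetical toy lattice: a 4-node path graph (not taken from the paper).
toy_adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
toy_pi = markov_matrix(toy_adjacency)
toy_moments = spectral_moments(toy_pi, 5)
toy_entropies = entropies(toy_pi, [0.25] * 4, 5)
```

In a workflow like the one the abstract describes, vectors such as `toy_moments` or `toy_entropies` would be computed per sequence and passed as input features to a classifier such as LDA or a neural network.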
Pages: 2122-2128
Page count: 7