On application of constitutional descriptors for merging of quinoxaline data sets, using linear statistical methods

Cited by: 4
Authors
Ghosh, Payell [1 ]
Vracko, Marjan [2 ]
Chattopadhyay, Asis Kumar [3 ]
Bagchi, Manish C. [1 ]
Affiliations
[1] Indian Inst Chem Biol, Struct Biol & Bioinformat Div, Kolkata 700032, India
[2] Natl Inst Chem, Lab Chemometr, Ljubljana 1000, Slovenia
[3] Univ Calcutta, Univ Coll Sci, Dept Stat, Kolkata 700019, India
Keywords
principal component analysis; partial least squares; quantitative structure-activity relationship; quinoxaline compounds; theoretical molecular descriptors;
DOI
10.1111/j.1747-0285.2008.00686.x
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
This paper attempts to unify two quinoxaline data sets, which carry a wide range of substituents at the 2, 3, 7, and 8 positions and show excellent antitubercular activity, with a view to developing robust and reliable structure-activity relationships. The two sets of quinoxaline 1,4-di-N-oxide derivatives, comprising 29 and 18 compounds respectively, were merged on the basis of constitutional descriptors, which characterize the structure of the molecules. Principal component analysis was performed to examine how the compounds from the two data sets are distributed in the space of the constitutional descriptors. The distribution of compounds in the score plot suggests that the two data sets can be unified, which is useful for model development. Outliers were detected through residual analysis of the partial least squares regression models. The superiority of the constitutional descriptors over the other calculated molecular descriptors was established by leave-one-out cross-validation of the partial least squares regression models. Internal validation by the leave-many-out method also gave good results, supporting the stability of the models. The linear partial least squares regression analysis yields a statistically significant and robust quantitative structure-activity relationship model.
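The workflow summarized in the abstract (principal component scores for the merged descriptor matrix, then partial least squares regression judged by leave-one-out cross-validation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the descriptor values and activities are synthetic random numbers, the number of descriptors (6) and latent variables (3) are arbitrary, and the NIPALS PLS1 routine is a generic textbook formulation, not necessarily the software or settings used in the paper. Only the set sizes, 29 and 18 compounds, come from the abstract.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Return principal component scores of the autoscaled matrix X."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, S, _ = np.linalg.svd(Xs, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def pls1_fit(X, y, n_components):
    """Fit single-response PLS by NIPALS; return (beta, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                 # weight vector: covariance direction
        w /= np.linalg.norm(w)
        t = Xk @ w                    # latent variable scores
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        c = yk @ t / tt               # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)
    return beta, x_mean, y_mean

def loo_q2(X, y, n_components):
    """Leave-one-out cross-validated q^2 = 1 - PRESS / total sum of squares."""
    n = len(y)
    press = 0.0
    for i in range(n):
        train = np.arange(n) != i     # hold out compound i
        beta, xm, ym = pls1_fit(X[train], y[train], n_components)
        pred = (X[i] - xm) @ beta + ym
        press += (y[i] - pred) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

# Synthetic stand-ins for the two data sets (29 and 18 compounds).
rng = np.random.default_rng(0)
set_a = rng.normal(size=(29, 6))      # hypothetical descriptor values
set_b = rng.normal(size=(18, 6))
X = np.vstack([set_a, set_b])         # merged descriptor matrix
true_coef = np.array([1.0, -0.5, 0.8, 0.0, 0.3, -1.2])  # made-up coefficients
y = X @ true_coef + 0.1 * rng.normal(size=X.shape[0])   # made-up activities

scores = pca_scores(X)                # rows 0-28: set A, rows 29-46: set B
q2 = loo_q2(X, y, n_components=3)
print("score matrix shape:", scores.shape)
print("LOO q2 with 3 latent variables:", round(float(q2), 3))
```

Plotting the two columns of `scores` with one marker per source data set gives the kind of score plot the abstract refers to: if the two groups of points overlap rather than forming separate clusters, merging the sets before regression is easier to justify.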
Pages: 155-162
Page count: 8