Exploring Dimensionality Reduction Techniques for Deep Learning Driven QSAR Models of Mutagenicity

Cited by: 3
Authors
Kalian, Alexander D. [1]
Benfenati, Emilio [2]
Osborne, Olivia J. [3]
Gott, David [3]
Potter, Claire [3]
Dorne, Jean-Lou C. M. [4]
Guo, Miao [5]
Hogstrand, Christer [6]
Affiliations
[1] Kings Coll London, Dept Nutr Sci, Franklin Wilkins Bldg,150 Stamford St, London SE1 9NH, England
[2] Ist Ric Farmacolog Mario Negri IRCCS, Via Mario Negri 2, I-20156 Milan, Italy
[3] Food Stand Agcy, 70 Petty France, London SW1H 9EX, England
[4] European Food Safety Author EFSA, Via Carlo Magno 1A, I-43126 Parma, Italy
[5] Kings Coll London, Dept Engn, Strand Campus, London WC2R 2LS, England
[6] Kings Coll London, Dept Analyt Environm & Forens Sci, Franklin Wilkins Bldg,150 Stamford St, London SE1 9NH, England
Funding
Biotechnology and Biological Sciences Research Council (UK)
Keywords
QSAR; dimensionality reduction; deep learning; autoencoder; principal component analysis; locally linear embedding; grid search; hyperparameter optimisation; mutagenicity; cheminformatics
DOI
10.3390/toxics11070572
Chinese Library Classification number
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
Dimensionality reduction techniques are crucial for enabling deep learning driven quantitative structure-activity relationship (QSAR) models to navigate higher dimensional toxicological spaces; however, the choice of a specific technique is often arbitrary and poorly explored. Six dimensionality reduction techniques (both linear and non-linear) were therefore applied to a higher dimensionality mutagenicity dataset and compared in their ability to power a simple deep learning driven QSAR model, following grid searches for optimal hyperparameter values. Comparatively simple linear techniques, such as principal component analysis (PCA), were found to be sufficient for optimal QSAR model performance, which indicated that the original dataset was at least approximately linearly separable (in accordance with Cover's theorem). However, certain non-linear techniques, such as kernel PCA and autoencoders, performed at closely comparable levels while (especially in the case of autoencoders) being more widely applicable to potentially non-linearly separable datasets. Analysis of the chemical space, in terms of XLogP and molecular weight, showed that the vast majority of testing data fell within the defined applicability domain, and that certain regions were measurably more problematic and degraded performance. The results nevertheless indicated that certain dimensionality reduction techniques were able to facilitate uniquely beneficial navigations of the chemical space.
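The workflow summarised in the abstract (reduce dimensionality, then grid-search hyperparameters for a neural network classifier) can be illustrated with a minimal sketch. This is not the authors' code or data: the descriptor matrix is synthetic (scikit-learn's make_classification), the small MLPClassifier merely stands in for the paper's deep learning model, only three of the six techniques (PCA, kernel PCA, locally linear embedding) are included, and every grid value is an arbitrary assumption chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, KernelPCA
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a molecular descriptor matrix with binary
# mutagenicity labels (assumption: NOT the dataset used in the paper).
X, y = make_classification(
    n_samples=300, n_features=64, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale -> reduce -> classify. The "reduce" step is a placeholder that the
# grid search swaps for each candidate technique.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("reduce", PCA()),
    ("model", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                            random_state=0)),
])

# One sub-grid per technique; all values are illustrative assumptions.
param_grid = [
    {"reduce": [PCA()], "reduce__n_components": [5, 10, 20]},
    {"reduce": [KernelPCA(kernel="rbf")], "reduce__n_components": [5, 10, 20]},
    {"reduce": [LocallyLinearEmbedding(random_state=0)],
     "reduce__n_components": [5, 10],
     "reduce__n_neighbors": [15, 30]},
]

# 3-fold cross-validated grid search, scored by classification accuracy.
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

best_reducer = type(search.best_params_["reduce"]).__name__
best_n_components = search.best_params_["reduce__n_components"]
test_accuracy = search.score(X_test, y_test)
print(best_reducer, best_n_components, round(test_accuracy, 3))
```

Treating the reduction step itself as a grid-searched hyperparameter keeps the comparison fair, since every technique is tuned and scored under the same cross-validation splits. Which technique wins on this synthetic data says nothing about the paper's findings.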
Pages: 24