CapsCarcino: A novel sparse data deep learning tool for predicting carcinogens

Cited by: 29
Authors
Wang, Yi-Wei [1, 2, 3]
Huang, Lei [4, 5]
Jiang, Si-Wen [4]
Li, Kan [1, 2]
Zou, Jun [1, 2]
Yang, Sheng-Yong [1, 2]
Affiliations
[1] Sichuan Univ, West China Hosp, State Key Lab Biotherapy, Chengdu 610041, Sichuan, Peoples R China
[2] Sichuan Univ, West China Hosp, Canc Ctr, Chengdu 610041, Sichuan, Peoples R China
[3] Southwest Med Univ, Coll Preclin Med, Luzhou 646000, Sichuan, Peoples R China
[4] Univ Elect Sci & Technol China, Sch Comp Sci & Engineer, Chengdu 611731, Sichuan, Peoples R China
[5] Sichuan Coll Architectural Technol, Basic Teaching Dept, Deyang 61800, Sichuan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Deep learning; Carcinogenicity; Predictive classifier; Capsule network; Computational toxicology; IN-SILICO PREDICTION; NONCONGENERIC CHEMICALS; MODELS; MUTAGENICITY; TOXICITY; BIOASSAY;
DOI
10.1016/j.fct.2019.110921
Chinese Library Classification
TS2 [Food industry]
Discipline classification code
0832
Abstract
Determining chemical carcinogenicity in the early stages of drug discovery is essential for preventing the adverse effects of carcinogens on human health. There has been a recent surge of interest in developing computational approaches to predict chemical carcinogenicity. However, the predictive power of many existing approaches is limited, leaving considerable room for improvement. Here, we develop a new deep learning architecture, termed CapsCarcino, to distinguish between carcinogens and noncarcinogens. CapsCarcino is built on a dynamic routing algorithm that requires less data, extracts more comprehensive information, and does not require feature selection. We find that CapsCarcino provides significantly better predictive and generalization ability than five other machine learning models. Specifically, the best CapsCarcino model achieves an accuracy of 85.0% on an external validation dataset. In addition, we find that the advantage of CapsCarcino over the other methods is robust and holds on sparse datasets: trained on merely 20% of the dataset, CapsCarcino performs comparably to the other methods trained on the full training dataset. Further mechanism analysis indicates that CapsCarcino can efficiently learn the characteristics of carcinogens even when structural alerts are insufficiently represented. The results indicate that CapsCarcino should be helpful for carcinogen risk assessment.
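
The abstract names a dynamic routing algorithm but does not spell it out. As an illustration only, the following is a minimal pure-Python sketch of routing-by-agreement as it is commonly described for capsule networks. It is not the authors' implementation: the function names (`squash`, `softmax`, `dynamic_routing`), the three routing iterations, and the toy sizes (8 input capsules, 2 output capsules, 4 dimensions) are all assumptions made for this example.

```python
import math
import random


def squash(s):
    """Scale vector s so its length lies in [0, 1) while keeping its direction."""
    sq_norm = sum(x * x for x in s)
    if sq_norm == 0.0:
        return [0.0 for _ in s]
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm)
    return [scale * x for x in s]


def softmax(row):
    """Numerically stable softmax over a list of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, num_iters=3):
    """Route prediction vectors to output capsules by agreement.

    u_hat[i][j] is the prediction vector that input capsule i makes
    for output capsule j (hypothetical toy input, not real descriptors).
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits, start uniform
    for _ in range(num_iters):
        # Coupling coefficients: each input capsule distributes weight 1 over outputs.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            # Weighted sum of predictions for output capsule j, then squash.
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Raise the logit where a prediction agrees (dot product) with the output.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, c


random.seed(0)
u_hat = [[[random.gauss(0, 1) for _ in range(4)] for _ in range(2)]
         for _ in range(8)]
v, c = dynamic_routing(u_hat)
lengths = [math.sqrt(sum(x * x for x in vec)) for vec in v]
```

In a two-class setting such as carcinogen versus noncarcinogen, the lengths of the two output vectors could be read as class scores; that reading is likewise an assumption of this sketch rather than a detail taken from the abstract.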
Pages: 11