CapsCarcino: A novel sparse data deep learning tool for predicting carcinogens

Cited by: 29
Authors
Wang, Yi-Wei [1 ,2 ,3 ]
Huang, Lei [4 ,5 ]
Jiang, Si-Wen [4 ]
Li, Kan [1 ,2 ]
Zou, Jun [1 ,2 ]
Yang, Sheng-Yong [1 ,2 ]
Affiliations
[1] Sichuan Univ, West China Hosp, State Key Lab Biotherapy, Chengdu 610041, Sichuan, Peoples R China
[2] Sichuan Univ, West China Hosp, Canc Ctr, Chengdu 610041, Sichuan, Peoples R China
[3] Southwest Med Univ, Coll Preclin Med, Luzhou 646000, Sichuan, Peoples R China
[4] Univ Elect Sci & Technol China, Sch Comp Sci & Engineer, Chengdu 611731, Sichuan, Peoples R China
[5] Sichuan Coll Architectural Technol, Basic Teaching Dept, Deyang 61800, Sichuan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Deep learning; Carcinogenicity; Predictive classifier; Capsule network; Computational toxicology; IN-SILICO PREDICTION; NONCONGENERIC CHEMICALS; MODELS; MUTAGENICITY; TOXICITY; BIOASSAY;
DOI
10.1016/j.fct.2019.110921
Chinese Library Classification
TS2 [Food industry]
Discipline classification code
0832
Abstract
Determining chemical carcinogenicity in the early stages of drug discovery is fundamentally important for preventing the adverse effects of carcinogens on human health. There has been a recent surge of interest in developing computational approaches to predict chemical carcinogenicity. However, the predictive power of many existing approaches is limited, and there is plenty of room for improvement. Here, we develop a new deep learning architecture, termed CapsCarcino, to distinguish between carcinogens and noncarcinogens. CapsCarcino is constructed based on a dynamic routing algorithm that requires less data, extracts more comprehensive information, and does not require feature selection. We find that CapsCarcino shows significantly better predictive and generalization ability than five other machine learning models. Specifically, the best CapsCarcino model achieves an accuracy of 85.0% on an external validation dataset. In addition, we find that the advantage of CapsCarcino over the other methods is robust and holds on sparse datasets. Trained on merely 20% of the dataset, CapsCarcino performs comparably to the other methods trained on the full training dataset. Further mechanism analysis indicates that CapsCarcino could efficiently learn the characteristics of carcinogens even if structural alerts are insufficiently represented. The results indicate that CapsCarcino should be helpful for carcinogen risk assessment.
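The "dynamic routing algorithm" named in the abstract is, in the capsule network literature, usually the routing-by-agreement procedure. The following is a minimal sketch in plain Python, assuming that standard formulation. The abstract does not describe CapsCarcino's actual architecture, so the function names, the input shapes, the three routing iterations, and the toy prediction vectors are all illustrative assumptions, not the authors' implementation.

```python
import math

def squash(s):
    """Scale a vector so its length lies in [0, 1) while keeping its direction."""
    sq = sum(x * x for x in s)
    if sq == 0.0:
        return [0.0 for _ in s]
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in s]

def softmax(row):
    """Numerically stable softmax over a list of logits."""
    m = max(row)
    e = [math.exp(x - m) for x in row]
    total = sum(e)
    return [x / total for x in e]

def dynamic_routing(u_hat, n_iters=3):
    """Routing-by-agreement (assumed standard form).

    u_hat[i][j] is the prediction vector that input capsule i makes for
    output capsule j. Returns the output capsule vectors and the final
    coupling coefficients.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]   # routing logits
    v = [[0.0] * dim for _ in range(n_out)]    # output capsules
    c = [softmax(row) for row in b]
    for _ in range(n_iters):
        # Coupling coefficients: each input distributes itself over outputs.
        c = [softmax(row) for row in b]
        for j in range(n_out):
            # Weighted sum of predictions, then squash to get output capsule j.
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                   for d in range(dim)]
            v[j] = squash(s_j)
        # Raise the logit wherever a prediction agrees with the output.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, c

# Hypothetical toy input: 3 input capsules, 2 output capsules, 2-D vectors.
# The inputs agree on output 0 and disagree on output 1.
u_hat = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[0.9, 0.1], [-1.0, 0.0]],
    [[1.0, -0.1], [0.0, 1.0]],
]
v, c = dynamic_routing(u_hat)
lengths = [math.sqrt(sum(x * x for x in vec)) for vec in v]
```

In a two-class setting such as carcinogen versus noncarcinogen, the length of each output capsule would typically be read as the score for that class; here the output on which the inputs agree ends up with the longer vector.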
Pages: 11