CapsCarcino: A novel sparse data deep learning tool for predicting carcinogens

Cited by: 29

Authors:

Wang, Yi-Wei [1,2,3]

Huang, Lei [4,5]

Jiang, Si-Wen [4]

Li, Kan [1,2]

Zou, Jun [1,2]

Yang, Sheng-Yong [1,2]

Affiliations:

[1] Sichuan Univ, West China Hosp, State Key Lab Biotherapy, Chengdu 610041, Sichuan, Peoples R China

[2] Sichuan Univ, West China Hosp, Canc Ctr, Chengdu 610041, Sichuan, Peoples R China

[3] Southwest Med Univ, Coll Preclin Med, Luzhou 646000, Sichuan, Peoples R China

[4] Univ Elect Sci & Technol China, Sch Comp Sci & Engineer, Chengdu 611731, Sichuan, Peoples R China

[5] Sichuan Coll Architectural Technol, Basic Teaching Dept, Deyang 61800, Sichuan, Peoples R China

Source:

FOOD AND CHEMICAL TOXICOLOGY | 2020 / Vol. 135

Funding:

National Natural Science Foundation of China;

Keywords:

Deep learning; Carcinogenicity; Predictive classifier; Capsule network; Computational toxicology; IN-SILICO PREDICTION; NONCONGENERIC CHEMICALS; MODELS; MUTAGENICITY; TOXICITY; BIOASSAY;

DOI:

10.1016/j.fct.2019.110921

Chinese Library Classification (CLC) number:

TS2 [Food industry];

Discipline classification code:

0832;

Abstract:

Determining chemical carcinogenicity in the early stages of drug discovery is fundamentally important to prevent the adverse effects of carcinogens on human health. There has been a recent surge of interest in developing computational approaches to predict chemical carcinogenicity. However, the predictive power of many existing approaches is limited, and there is plenty of room for improvement. Here, we develop a new deep learning architecture, termed CapsCarcino, to distinguish between carcinogens and noncarcinogens. CapsCarcino is constructed based on a dynamic routing algorithm that requires less data, extracts more comprehensive information, and does not require feature selection. We find that CapsCarcino provides significantly better predictive and generalization ability than five other machine learning models. Specifically, the best model of CapsCarcino achieves an accuracy of 85.0% on an external validation dataset. In addition, we discover that the enhanced predictive capability of CapsCarcino over that of the other methods is robust and can be achieved using sparse datasets. Trained on merely 20% of the dataset, CapsCarcino performs comparably to the other methods trained on the full training dataset. Further mechanism analysis indicates that CapsCarcino could efficiently learn the characteristics of carcinogens even if structural alerts are insufficiently represented. The results indicate that CapsCarcino should be helpful for carcinogen risk assessment.


Pages: 11
