DeepTSS: multi-branch convolutional neural network for transcription start site identification from CAGE data

Cited by: 0

Authors:

Grigoriadis, Dimitris ^{[1,2]}

Perdikopanis, Nikos ^{[1,3,4]}

Georgakilas, Georgios K. ^{[4,5]}

Hatzigeorgiou, Artemis G. ^{[1,2]}

Affiliations:

[1] Hellenic Pasteur Institute, Athens, 11521, Greece

[2] Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, 35131, Greece

[3] Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Athens, 15784, Greece

[4] Department of Electrical and Computer Engineering, University of Thessaly, Volos, 38221, Greece

[5] ommAI Technologies, Tallinn, Estonia

Source:

BMC Bioinformatics | 2022 / Vol. 23

Keywords:

Bioinformatics - Computational methods - Convolution - Convolutional neural networks - Deep learning - Learning systems - Molecular biology - Proteins - Signal processing - Signal to noise ratio

DOI:

Not available

Abstract:

Background: The widespread use of Cap Analysis of Gene Expression (CAGE) has led to numerous breakthroughs in understanding the mechanisms of transcription. Recent evidence in the literature, however, suggests that CAGE suffers from transcriptional and technical noise. Regardless of sample quality, a significant number of CAGE peaks are not associated with transcription initiation events. This type of signal is typically attributed to technical noise and, more frequently, to random five-prime capping or transcription byproducts. There is therefore a need for computational methods that can increase the signal-to-noise ratio in CAGE data, resulting in error-free transcription start site (TSS) annotation and quantification of regulatory region usage. In this study, we present DeepTSS, a novel computational method for processing CAGE samples that combines genomic signal processing (GSP), structural DNA features, evolutionary conservation evidence and raw DNA sequence with Deep Learning (DL) to provide single-nucleotide TSS predictions with unprecedented levels of performance.

Results: To evaluate DeepTSS, we utilized experimental data, protein-coding gene annotations and computationally derived genome segmentations by chromatin states. DeepTSS was found to outperform existing algorithms on all benchmarks, achieving 98% precision and 96% sensitivity (accuracy 95.4%) on the protein-coding gene strategy, with 96.66% of its positive predictions overlapping active chromatin, and 98.27% and 92.04% co-localizing with at least one transcription factor and H3K4me3 peak, respectively.

Conclusions: CAGE is a key protocol in deciphering the language of transcription; however, like every experimental protocol, it suffers from biological and technical noise that can severely affect downstream analyses. DeepTSS is a novel DL-based method for effectively removing noisy CAGE signal. In contrast to existing software, DeepTSS does not require feature selection, since the embedded convolutional layers can readily identify patterns and utilize only the important ones for the classification task. This study highlights the key role that DL can play in Molecular Biology by removing the inherent flaws of the experimental protocols that form the backbone of contemporary research. Here, we show how DeepTSS can unleash the full potential of an already popular and mature method such as CAGE, and push the boundaries of research on coding and non-coding gene expression regulators even further. © 2022, The Author(s).
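The precision, sensitivity and accuracy figures quoted in the Results can be related to confusion-matrix counts with a minimal sketch such as the one below. The function name `classification_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the DeepTSS evaluation and do not reproduce its reported numbers.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute precision, sensitivity and accuracy from confusion-matrix counts.

    tp: predicted TSS that are true TSS      fp: predicted TSS that are noise
    tn: predicted noise that is noise        fn: true TSS predicted as noise
    """
    if min(tp, fp, tn, fn) < 0:
        raise ValueError("counts must be non-negative")
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one observation is required")

    # Precision: fraction of positive predictions that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Sensitivity (recall): fraction of true positives that are recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Accuracy: fraction of all predictions that are correct.
    accuracy = (tp + tn) / total

    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "accuracy": accuracy,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not from the paper).
    metrics = classification_metrics(tp=96, fp=2, tn=98, fn=4)
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
```

Precision and sensitivity depend only on how positive calls and true positives are handled, whereas accuracy also counts correctly rejected noise, so the three values need not move together when the class balance of a benchmark changes.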
