DeepTFactor: A deep learning-based tool for the prediction of transcription factors

Cited by: 65
Authors
Kim, Gi Bae [1, 2, 3, 4, 5, 6]
Gao, Ye [7, 8, 9]
Palsson, Bernhard O. [8, 9, 10]
Lee, Sang Yup [1, 2, 3, 4, 5, 6]
Affiliations
[1] Korea Adv Inst Sci & Technol, Dept Chem & Biomol Engn, Metab & Biomol Engn Natl Res Lab, BK21 Plus Program, Daejeon 34141, South Korea
[2] Korea Adv Inst Sci & Technol, Syst Metab Engn & Syst Healthcare Cross Generat C, Daejeon 34141, South Korea
[3] Korea Adv Inst Sci & Technol, KAIST Inst BioCentury, Daejeon 34141, South Korea
[4] Korea Adv Inst Sci & Technol, KAIST Inst Artificial Intelligence, Daejeon 34141, South Korea
[5] Korea Adv Inst Sci & Technol, BioProc Engn Res Ctr, Daejeon 34141, South Korea
[6] Korea Adv Inst Sci & Technol, BioInformat Res Ctr, Daejeon 34141, South Korea
[7] Univ Calif San Diego, Div Biol Sci, La Jolla, CA 92093 USA
[8] Univ Calif San Diego, Dept Bioengn, La Jolla, CA 92093 USA
[9] Univ Calif San Diego, Bioinformat & Syst Biol Program, La Jolla, CA 92093 USA
[10] Novo Nordisk Fdn Ctr Biosustainabil, DK-2800 Lyngby, Denmark
Funding
National Research Foundation, Singapore
Keywords
ChIP-exo; deep learning; transcription factor; transcription regulation; y-ome; ESCHERICHIA-COLI; PROTEIN; MODELS
DOI
10.1073/pnas.2021171118
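Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09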
Abstract
A transcription factor (TF) is a sequence-specific DNA-binding protein that modulates the transcription of a set of particular genes, and thus regulates gene expression in the cell. TFs have commonly been predicted by analyzing sequence homology with the DNA-binding domains of TFs already characterized. Thus, TFs that do not show homologies with the reported ones are difficult to predict. Here we report the development of a deep learning-based tool, DeepTFactor, that predicts whether a protein in question is a TF. DeepTFactor uses a convolutional neural network to extract features of a protein. It showed high performance in predicting TFs of both eukaryotic and prokaryotic origins, resulting in F1 scores of 0.8154 and 0.8000, respectively. Analysis of the gradients of prediction score with respect to input suggested that DeepTFactor detects DNA-binding domains and other latent features for TF prediction. DeepTFactor predicted 332 candidate TFs in Escherichia coli K-12 MG1655. Among them, 84 candidate TFs belong to the y-ome, which is a collection of genes that lack experimental evidence of function. We experimentally validated the results of DeepTFactor prediction by further characterizing genome-wide binding sites of three predicted TFs, YqhC, YiaU, and YahB. Furthermore, we made available the list of 4,674,808 TFs predicted from 73,873,012 protein sequences in 48,346 genomes. DeepTFactor will serve as a useful tool for predicting TFs, which is necessary for understanding the regulatory systems of organisms of interest. We provide DeepTFactor as a stand-alone program, available at https://bitbucket.org/kaistsystemsbiology/deeptfactor.
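For readers unfamiliar with the F1 metric quoted in the abstract, the following is a minimal sketch of how an F1 score is computed from the counts of a binary classifier (TF versus non-TF). The function name `f1_score` and the counts are hypothetical, chosen only for illustration; they are not taken from the paper and this is not DeepTFactor's own evaluation code.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    # With no true positives, false positives, or false negatives, F1 is undefined; return 0.0.
    return 2 * tp / denom if denom else 0.0


# Hypothetical counts for illustration only (assumption, not data from the paper).
tp, fp, fn = 8, 2, 2
precision = tp / (tp + fp)  # fraction of predicted TFs that are real TFs
recall = tp / (tp + fn)     # fraction of real TFs that were predicted
score = f1_score(tp, fp, fn)
print(f"precision={precision:.4f} recall={recall:.4f} F1={score:.4f}")
```

Because F1 ignores true negatives, it is a common choice when non-TFs greatly outnumber TFs in a test set.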
Pages: 5