A self-supervised deep learning method for data-efficient training in genomics

Cited by: 0
Authors
Hüseyin Anil Gündüz
Martin Binder
Xiao-Yin To
René Mreches
Bernd Bischl
Alice C. McHardy
Philipp C. Münch
Mina Rezaei
Affiliations
[1] Department of Statistics, LMU Munich
[2] Munich Center for Machine Learning
[3] Department for Computational Biology of Infection Research, Helmholtz Center for Infection Research
[4] Braunschweig Integrated Centre of Systems Biology (BRICS), Technische Universität Braunschweig
[5] German Center for Infection Research (DZIF), partner site Hannover Braunschweig
[6] Department of Biostatistics, Harvard School of Public Health
Abstract
Deep learning in bioinformatics is often limited to problems where extensive amounts of labeled data are available for supervised classification. By exploiting unlabeled data, self-supervised learning techniques can improve the performance of machine learning models in the presence of limited labeled data. Although many self-supervised learning methods have been suggested before, they have failed to exploit the unique characteristics of genomic data. Therefore, we introduce Self-GenomeNet, a self-supervised learning technique that is custom-tailored for genomic data. Self-GenomeNet leverages reverse-complement sequences and effectively learns short- and long-term dependencies by predicting targets of different lengths. Self-GenomeNet performs better than other self-supervised methods in data-scarce genomic tasks and outperforms standard supervised training with ~10 times fewer labeled training data. Furthermore, the learned representations generalize well to new datasets and tasks. These findings suggest that Self-GenomeNet is well suited for large-scale, unlabeled genomic datasets and could substantially improve the performance of genomic models.
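The two ideas named in the abstract, reverse-complement sequences and targets of different lengths, can be illustrated with a minimal sketch. This is not the Self-GenomeNet implementation: the function names `reverse_complement` and `context_target_pairs`, and the way a sequence is split into a context and a reverse-complemented target, are assumptions made here for illustration only.

```python
# Illustrative sketch only; not the authors' code.
# All names below are hypothetical and chosen for this example.

# Translation table mapping each base to its complement (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def context_target_pairs(seq: str, target_lengths: list[int]) -> list[tuple[str, str]]:
    """Build one (context, target) pair per requested target length.

    Assumed scheme: the last `t` bases are reverse-complemented to form
    the target, and the bases before them form the context. Short targets
    stand in for short-range dependencies, long targets for long-range ones.
    """
    pairs = []
    for t in target_lengths:
        if 0 < t < len(seq):  # skip lengths that leave no context or no target
            pairs.append((seq[:-t], reverse_complement(seq[-t:])))
    return pairs


if __name__ == "__main__":
    for context, target in context_target_pairs("ATGCAA", [2, 4]):
        print(context, target)
```

In a real self-supervised setup, an encoder would map each context and target to a representation and a contrastive or predictive loss would relate the two; that part is omitted because the abstract does not specify it.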
Related papers
50 in total
  • [21] Efficient Self-Supervised Data Collection for Offline Robot Learning
    Endrawis, Shadi
    Leibovich, Gal
    Jacob, Guy
    Novik, Gal
    Tamar, Aviv
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 4650 - 4656
  • [22] Biologically Plausible Training Mechanisms for Self-Supervised Learning in Deep Networks
    Tang, Mufeng
    Yang, Yibo
    Amit, Yali
    FRONTIERS IN COMPUTATIONAL NEUROSCIENCE, 2022, 16
  • [23] A Data-Efficient Method of Deep Reinforcement Learning for Chinese Chess
    Xu, Changming
    Ding, Hengfeng
    Zhang, Xuejian
    Wang, Cong
    Yang, Hongji
    2022 IEEE 22ND INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY, AND SECURITY COMPANION, QRS-C, 2022, : 687 - 693
  • [24] A New Self-supervised Method for Supervised Learning
    Yang, Yuhang
    Ding, Zilin
    Cheng, Xuan
    Wang, Xiaomin
    Liu, Ming
    INTERNATIONAL CONFERENCE ON COMPUTER VISION, APPLICATION, AND DESIGN (CVAD 2021), 2021, 12155
  • [25] A novel collaborative self-supervised learning method for radiomic data
    Li, Zhiyuan
    Li, Hailong
    Ralescu, Anca L.
    Dillman, Jonathan R.
    Parikh, Nehal A.
    He, Lili
    NEUROIMAGE, 2023, 277
  • [26] Self-Adaptive Training: Bridging Supervised and Self-Supervised Learning
    Huang, Lang
    Zhang, Chao
    Zhang, Hongyang
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (03) : 1362 - 1377
  • [27] Deep echocardiography: data-efficient supervised and semi-supervised deep learning towards automated diagnosis of cardiac disease
    Ali Madani
    Jia Rui Ong
    Anshul Tibrewal
    Mohammad R. K. Mofrad
    npj Digital Medicine, 1
  • [28] Deep echocardiography: data-efficient supervised and semi-supervised deep learning towards automated diagnosis of cardiac disease
    Madani, Ali
    Ong, Jia Rui
    Tibrewal, Anshul
    Mofrad, Mohammad R. K.
    NPJ DIGITAL MEDICINE, 2018, 1
  • [29] Seismic Data Denoising Using a Self-Supervised Deep Learning Network
    Wang, Detao
    Chen, Guoxiong
    Chen, Jianwei
    Cheng, Qiuming
    MATHEMATICAL GEOSCIENCES, 2024, 56 (03) : 487 - 510
  • [30] Self-Supervised Deep Learning Framework for Anomaly Detection in Traffic Data
    Morris, Clint
    Yang, Jidong J.
    Chorzepa, Mi Geum
    Kim, S. Sonny
    Durham, Stephan A.
    JOURNAL OF TRANSPORTATION ENGINEERING PART A-SYSTEMS, 2022, 148 (05)