A self-supervised deep learning method for data-efficient training in genomics

Authors
Hüseyin Anil Gündüz
Martin Binder
Xiao-Yin To
René Mreches
Bernd Bischl
Alice C. McHardy
Philipp C. Münch
Mina Rezaei
Affiliations
[1] Department of Statistics, LMU Munich
[2] Munich Center for Machine Learning
[3] Department for Computational Biology of Infection Research, Helmholtz Center for Infection Research
[4] Braunschweig Integrated Centre of Systems Biology (BRICS), Technische Universität Braunschweig
[5] German Center for Infection Research (DZIF), partner site Hannover Braunschweig
[6] Department of Biostatistics, Harvard School of Public Health
Abstract
Deep learning in bioinformatics is often limited to problems where extensive amounts of labeled data are available for supervised classification. By exploiting unlabeled data, self-supervised learning techniques can improve the performance of machine learning models in the presence of limited labeled data. Although many self-supervised learning methods have been suggested before, they have failed to exploit the unique characteristics of genomic data. Therefore, we introduce Self-GenomeNet, a self-supervised learning technique that is custom-tailored for genomic data. Self-GenomeNet leverages reverse-complement sequences and effectively learns short- and long-term dependencies by predicting targets of different lengths. Self-GenomeNet performs better than other self-supervised methods in data-scarce genomic tasks and outperforms standard supervised training with ~10 times fewer labeled training data. Furthermore, the learned representations generalize well to new datasets and tasks. These findings suggest that Self-GenomeNet is well suited for large-scale, unlabeled genomic datasets and could substantially improve the performance of genomic models.
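The abstract names two ingredients: reverse-complement sequences and prediction targets of different lengths. The Python sketch below illustrates only these two data-preparation ideas and is not the authors' implementation. The helper names (`reverse_complement`, `split_context_targets`), the split position, and the target lengths are assumptions chosen for illustration.

```python
# Illustrative sketch only; this is NOT the Self-GenomeNet implementation.
# All names and parameters here are hypothetical.

# Translation table mapping each nucleotide to its complement (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def split_context_targets(seq: str, split: int, target_lengths):
    """Split a sequence into a context prefix and several targets.

    Each target starts right after the context and has one of the
    requested lengths; lengths that would run past the end are skipped.
    """
    context = seq[:split]
    targets = [
        seq[split:split + n]
        for n in target_lengths
        if split + n <= len(seq)
    ]
    return context, targets


if __name__ == "__main__":
    seq = "ATGCGTAC"
    print(reverse_complement(seq))  # GTACGCAT
    context, targets = split_context_targets(seq, split=4, target_lengths=(2, 4))
    print(context, targets)  # ATGC ['GT', 'GTAC']
```

In a self-supervised setup along these lines, a model would presumably encode the context and be trained to predict a representation of each target, with the reverse-complement strand supplying a second view of the same region. How Self-GenomeNet actually combines these pieces is described in the paper, not here.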