LEC-Codec: Learning-Based Genome Data Compression

Cited by: 0
Authors
Sun, Zhenhao [1]
Wang, Meng [2]
Wang, Shiqi [1]
Kwong, Sam [2]
Affiliations
[1] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
[2] Lingnan Univ, Sch Data Sci, Hong Kong, Peoples R China
Keywords
Genomics; Bioinformatics; Encoding; Context modeling; Symbols; Predictive models; Codecs; Computational modeling; Complexity theory; Termination of employment; Data compression; learning-based method; lossless genome compression; non-reference method;
DOI
10.1109/TCBB.2024.3473899
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
In this paper, we propose a Learning-based gEnome Codec (LEC), which is designed for high efficiency and enhanced flexibility. The LEC integrates several advanced technologies, including Group of Bases (GoB) compression, multi-stride coding and bidirectional prediction, all of which are aimed at optimizing the balance between coding complexity and performance in lossless compression. The model applied in our proposed codec is data-driven, based on deep neural networks to infer probabilities for each symbol, enabling fully parallel encoding and decoding with configured complexity for diverse applications. Based upon a set of configurations on compression ratios and inference speed, experimental results show that the proposed method is very efficient in terms of compression performance and provides improved flexibility in real-world applications.
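The abstract says the codec infers a probability for each symbol. In lossless coders of this general kind, those probabilities usually drive an entropy coder, which spends roughly -log2(p) bits on a symbol predicted with probability p. The sketch below illustrates only that relationship. It is not the LEC model: the function name `ideal_code_length_bits` and the adaptive count-based predictor are hypothetical stand-ins for a learned network, and the input is assumed to contain only the bases A, C, G and T.

```python
import math

ALPHABET = "ACGT"  # assumption: input contains only these four bases


def ideal_code_length_bits(sequence, order=2):
    """Total ideal code length in bits under an adaptive order-k count model.

    Hypothetical stand-in for a learned predictor: each context keeps
    Laplace-smoothed symbol counts that are updated after every symbol,
    so a decoder could rebuild the same probabilities step by step.
    """
    counts = {}  # context string -> {symbol: count}
    total_bits = 0.0
    for i, symbol in enumerate(sequence):
        context = sequence[max(0, i - order):i]
        ctx_counts = counts.setdefault(context, {s: 1 for s in ALPHABET})
        p = ctx_counts[symbol] / sum(ctx_counts.values())
        total_bits += -math.log2(p)  # cost an ideal entropy coder would pay
        ctx_counts[symbol] += 1      # adapt the model after coding the symbol
    return total_bits


if __name__ == "__main__":
    seq = "ACGT" * 50
    print(ideal_code_length_bits(seq) / len(seq), "bits per base")
```

A uniform predictor costs exactly 2 bits per base, so any model that assigns the true next base a probability above 1/4 on average compresses below that baseline. Better predictions translate directly into fewer bits.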
Pages: 2447-2458
Page count: 12
Related papers
50 in total
  • [21] Learning-based multiresolution transforms with application to image compression
    Arandiga, Francesc
    Cohen, Albert
    Yanez, Dionisio F.
    SIGNAL PROCESSING, 2013, 93 (09) : 2474 - 2484
  • [22] Rate-constrained learning-based image compression
    Guerin Jr, Nilson D.
    da Silva, Renam Castro
    de Oliveira, Matheus C.
    Jung, Henrique
    Martins, Luiz Gustavo R.
    Peixoto, Eduardo
    Macchiavello, Bruno
    Hung, Edson M.
    Testoni, Vanessa
    Freitas, Pedro Garcia
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2022, 101
  • [23] LSVC: A Learning-based Stereo Video Compression Framework
    Chen, Zhenghao
    Lu, Guo
    Hu, Zhihao
    Liu, Shan
    Jiang, Wei
    Xu, Dong
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 6063 - 6072
  • [24] Learning-Based Data Transmissions for Future 6G Enabled Industrial IoT: A Data Compression Perspective
    Zhang, Mingqiang
    Zhang, Haixia
    Fang, Yuguang
    Yuan, Dongfeng
    IEEE NETWORK, 2022, 36 (05): : 180 - 187
  • [25] ELFIC: A Learning-based Flexible Image Codec with Rate-Distortion-Complexity Optimization
    Zhang, Zhichen
    Chen, Bolin
    Lin, Hongbin
    Lin, Jielian
    Wang, Xu
    Zhao, Tiesong
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 9252 - 9261
  • [26] Learning-based Fusion for Data Deduplication
    Dinerstein, Jared
    Dinerstein, Sabra
    Egbert, Parris K.
    Clyde, Stephen W.
    SEVENTH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS, 2008, : 66 - +
  • [27] Deep Learning-based Image Compression with Trellis Coded Quantization
    Li, Binglin
    Akbari, Mohammad
    Liang, Jie
    Wang, Yang
    2020 DATA COMPRESSION CONFERENCE (DCC 2020), 2020, : 13 - 22
  • [28] A Comparison of Machine Learning-Based and Conventional Technologies for Video Compression
    Mochurad, Lesia
    TECHNOLOGIES, 2024, 12 (04)
  • [29] Learning-based short text compression using BERT models
    Ozturk, Emir
    Mesut, Altan
    PEERJ COMPUTER SCIENCE, 2024, 10
  • [30] Impact of image compression on deep learning-based mammogram classification
    Yong-Yeon Jo
    Young Sang Choi
    Hyun Woo Park
    Jae Hyeok Lee
    Hyojung Jung
    Hyo-Eun Kim
    Kyounglan Ko
    Chan Wha Lee
    Hyo Soung Cha
    Yul Hwangbo
    Scientific Reports, 11