End-to-end Image Compression with Swin-Transformer

Cited by: 1
Authors
Wang, Meng [1]
Zhang, Kai [2]
Zhang, Li [2]
Li, Yue [2]
Li, Junru [3]
Wang, Yue [3]
Wang, Shiqi [1]
Affiliations
[1] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
[2] Bytedance Inc, San Diego, CA 92122 USA
[3] Beijing Bytedance Technol Co Ltd, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image compression; end-to-end compression; transformer; convolution;
DOI
10.1109/VCIP56404.2022.10008895
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In this paper, we propose an end-to-end image compression framework that incorporates Swin-Transformer modules to capture both localized and non-localized similarities in images. In particular, the Swin-Transformer modules are deployed in the analysis and synthesis stages, interleaved with convolution layers. The transformer layers are expected to provide more flexible receptive fields, so that spatially localized and non-localized redundancies can be eliminated more effectively. The proposed method shows strong capability in signal conjunction and prediction, which improves rate-distortion performance. Experimental results show that the proposed method outperforms existing methods on both natural scene and screen content images, achieving 22.46% BD-Rate savings compared with BPG. On screen content images, BD-Rate gains of over 30% are observed compared with the classical hyper-prior end-to-end coding method.
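The localized and non-localized behaviour mentioned in the abstract can be illustrated with the window mechanism that Swin-Transformer layers are generally built on: attention is computed inside non-overlapping windows, and alternate layers cyclically shift the feature map so that new windows straddle the old window borders. The following is a minimal sketch of that partitioning only, not the authors' implementation. The 4x4 grid, the window size of 2, the shift of 1, and the function names `cyclic_shift` and `window_partition` are illustrative assumptions; the abstract does not state the configuration used in the paper.

```python
from typing import List

Grid = List[List[int]]


def cyclic_shift(grid: Grid, shift: int) -> Grid:
    """Roll the grid up and to the left by `shift` positions, wrapping around."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


def window_partition(grid: Grid, window: int) -> List[List[int]]:
    """Split the grid into non-overlapping window x window blocks (row-major)."""
    h, w = len(grid), len(grid[0])
    if h % window or w % window:
        raise ValueError("grid size must be a multiple of the window size")
    blocks = []
    for r0 in range(0, h, window):
        for c0 in range(0, w, window):
            blocks.append([grid[r][c]
                           for r in range(r0, r0 + window)
                           for c in range(c0, c0 + window)])
    return blocks


# Hypothetical 4x4 feature map whose entries are just position indices 0..15.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]

regular = window_partition(grid, 2)                   # local windows
shifted = window_partition(cyclic_shift(grid, 1), 2)  # shifted windows

print(regular[0])  # [0, 1, 4, 5]
print(shifted[0])  # [5, 6, 9, 10]
```

In this toy case the first shifted window gathers one position from each of the four regular windows, which is how stacked layers exchange information beyond a single local window. Windows that wrap around the border (the last shifted window here) are normally masked during attention in Swin-style models; that step is omitted from this sketch.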
Pages: 5