End-to-end Image Compression with Swin-Transformer

Cited by: 1
Authors
Wang, Meng [1]
Zhang, Kai [2]
Zhang, Li [2]
Li, Yue [2]
Li, Junru [3]
Wang, Yue [3]
Wang, Shiqi [1]
Affiliations
[1] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
[2] Bytedance Inc, San Diego, CA 92122 USA
[3] Beijing Bytedance Technol Co Ltd, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Image compression; end-to-end compression; transformer; convolution
DOI
10.1109/VCIP56404.2022.10008895
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose an end-to-end image compression framework that incorporates swin-transformer modules to capture localized and non-localized similarities in images. In particular, the swin-transformer modules are deployed in the analysis and synthesis stages, interleaved with convolution layers. The transformer layers are expected to provide more flexible receptive fields, so that spatially localized and non-localized redundancies can be eliminated more effectively. The proposed method shows an excellent capability for signal conjunction and prediction, which improves rate-distortion performance. Experimental results show that the proposed method outperforms existing methods on both natural scene and screen content images, achieving 22.46% BD-Rate savings compared with BPG. On screen content images, BD-Rate gains of over 30% are observed compared with the classical hyper-prior end-to-end coding method.
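The BD-Rate figures quoted in the abstract are average bitrate differences between two rate-distortion curves at equal quality. The sketch below is not the authors' evaluation script; it is a minimal illustration of the usual Bjontegaard procedure, assuming exactly four (bitrate, PSNR) points per codec so that the cubic fit reduces to Lagrange interpolation. The function names and the sample numbers are hypothetical and chosen only to make the arithmetic easy to check.

```python
import math


def lagrange(xs, ys, x):
    """Evaluate the polynomial passing through (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def simpson(f, lo, hi, n=200):
    """Composite Simpson integration of f over [lo, hi]; n must be even."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(lo + k * h)
    return s * h / 3


def bd_rate(anchor, test):
    """Return the BD-Rate of `test` relative to `anchor`, in percent.

    Each argument is a list of four (bitrate, psnr) pairs with distinct
    PSNR values. A negative result means `test` needs fewer bits.
    """
    a_psnr = [p for _, p in anchor]
    t_psnr = [p for _, p in test]
    a_log = [math.log(r) for r, _ in anchor]
    t_log = [math.log(r) for r, _ in test]

    # Integrate only over the PSNR range covered by both curves.
    lo = max(min(a_psnr), min(t_psnr))
    hi = min(max(a_psnr), max(t_psnr))
    if hi <= lo:
        raise ValueError("the two curves do not overlap in PSNR")

    area_a = simpson(lambda q: lagrange(a_psnr, a_log, q), lo, hi)
    area_t = simpson(lambda q: lagrange(t_psnr, t_log, q), lo, hi)

    # Average log-rate difference, converted back to a percentage.
    avg_diff = (area_t - area_a) / (hi - lo)
    return (math.exp(avg_diff) - 1.0) * 100.0


# Hypothetical data: the test codec uses 20% fewer bits at every PSNR.
anchor = [(0.25, 30.0), (0.5, 33.0), (1.0, 36.0), (2.0, 39.0)]
test = [(0.8 * r, p) for r, p in anchor]
print(round(bd_rate(anchor, test), 2))
```

With the made-up curves above the result is approximately -20, that is, a 20% bitrate saving, which is how a figure such as "22.46% BD-Rate savings" should be read.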
Pages: 5
Related papers
50 in total
  • [1] An end-to-end medical image fusion network based on Swin-transformer
    Yu, Kaixin
    Yang, Xiaoming
    Jeon, Seunggil
    Dou, Qingyu
    MICROPROCESSORS AND MICROSYSTEMS, 2023, 98
  • [2] AN EFFICIENT END-TO-END IMAGE COMPRESSION TRANSFORMER
    Jeny, Afsana Ahsan
    Junayed, Masum Shah
    Islam, Md Baharul
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 1786 - 1790
  • [3] An End-to-End Automatic Classification Algorithm for Hyperspectral Images via 3D Convolution and Swin-Transformer
    Guo, Ningbo
    Yang, Lei
    Yue, Cong
    Zhang, Honggang
    Jiang, Mingyong
    Li, Yinan
    39TH YOUTH ACADEMIC ANNUAL CONFERENCE OF CHINESE ASSOCIATION OF AUTOMATION, YAC 2024, 2024, : 1400 - 1405
  • [4] Graph-Structured Swin-Transformer for Learned Image Compression
    Wang, Lilong
    Shi, Yunhui
    Wang, Jin
    Yin, Baocai
    Ling, Nam
    2024 DATA COMPRESSION CONFERENCE, DCC, 2024, : 592 - 592
  • [5] FLSTrack: focused linear attention swin-transformer network with dual-branch decoder for end-to-end multi-object tracking
    Zu, Dafu
    Duan, Xun
    Kong, Guangqian
    Long, Huiyun
    SIGNAL IMAGE AND VIDEO PROCESSING, 2025, 19 (01)
  • [6] SwinOCSR: end-to-end optical chemical structure recognition using a Swin Transformer
    Xu, Zhanpeng
    Li, Jianhua
    Yang, Zhaopeng
    Li, Shiliang
    Li, Honglin
    JOURNAL OF CHEMINFORMATICS, 2022, 14 (01)
  • [7] SwinOCSR: end-to-end optical chemical structure recognition using a Swin Transformer
    Zhanpeng Xu
    Jianhua Li
    Zhaopeng Yang
    Shiliang Li
    Honglin Li
    Journal of Cheminformatics, 14
  • [8] An end-to-end steel surface defect detection approach via Swin transformer
    Tang, Bo
    Song, Zi-Kai
    Sun, Wei
    Wang, Xing-Dong
    IET IMAGE PROCESSING, 2023, 17 (05) : 1334 - 1345
  • [9] End-to-End Optimized ROI Image Compression
    Cai, Chunlei
    Chen, Li
    Zhang, Xiaoyun
    Gao, Zhiyong
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 3442 - 3457
  • [10] Efficient end-to-end multispectral image compression
    Depoian, Arthur C., II
    Bailey, Colleen P.
    Guturu, Parthasarathy
    BIG DATA VI: LEARNING, ANALYTICS, AND APPLICATIONS, 2024, 13036