End-to-end Image Compression with Swin-Transformer

Cited by: 1

Authors:

Wang, Meng [1]

Zhang, Kai [2]

Zhang, Li [2]

Li, Yue [2]

Li, Junru [3]

Wang, Yue [3]

Wang, Shiqi [1]

Affiliations:

[1] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China

[2] Bytedance Inc, San Diego, CA 92122 USA

[3] Beijing Bytedance Technol Co Ltd, Beijing, Peoples R China

Source:

2022 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP) | 2022

Funding:

National Natural Science Foundation of China;

Keywords:

Image compression; end-to-end compression; transformer; convolution;

DOI:

10.1109/VCIP56404.2022.10008895

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we propose an end-to-end image compression framework, which cooperates with the swin-transformer modules to capture the localized and non-localized similarities in image compression. In particular, the swin-transformer modules are deployed in the analysis and synthesis stages, interleaving with convolution layers. The transformer layers are expected to perceive more flexible receptive fields, such that the spatially localized and non-localized redundancies could be more effectively eliminated. The proposed method reveals the excellent capability of signal conjunction and prediction, leading to the improvement of the rate and distortion performance. Experimental results show that the proposed method is superior to the existing methods on both natural scene and screen content images, where 22.46% BD-Rate savings are achieved when compared with the BPG. Over 30% BD-Rate gains could be observed with screen content images when compared with the classical hyper-prior end-to-end coding method.
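The abstract reports its gains as BD-Rate savings. As a rough illustration of what that metric measures, the sketch below follows the commonly used Bjøntegaard procedure: fit a cubic polynomial to log-bitrate as a function of PSNR for each codec, integrate both fits over the shared PSNR range, and convert the average gap into a percentage. This is not the authors' evaluation code; the function name `bd_rate` and the sample rate/PSNR points are hypothetical and chosen only for illustration.

```python
import numpy as np


def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Approximate Bjontegaard delta rate (percent) of a test codec vs. an anchor.

    Negative values mean the test codec needs fewer bits at equal PSNR.
    Assumes at least four rate/PSNR points per codec (cubic fit).
    """
    pa = np.asarray(psnr_anchor, dtype=float)
    pt = np.asarray(psnr_test, dtype=float)
    log_ra = np.log(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log(np.asarray(rate_test, dtype=float))

    # Only the PSNR interval covered by both curves is compared.
    lo = max(pa.min(), pt.min())
    hi = min(pa.max(), pt.max())

    # Fit log-rate as a cubic in (PSNR - lo); the shift keeps the fit well conditioned.
    fit_a = np.polyfit(pa - lo, log_ra, 3)
    fit_t = np.polyfit(pt - lo, log_rt, 3)

    # Integrate each fitted curve from 0 to (hi - lo); polyint uses a zero constant term.
    int_a = np.polyval(np.polyint(fit_a), hi - lo)
    int_t = np.polyval(np.polyint(fit_t), hi - lo)

    # Average log-rate difference over the interval, converted to a percentage.
    avg_diff = (int_t - int_a) / (hi - lo)
    return (np.exp(avg_diff) - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical points: the test codec spends 20% fewer bits at every PSNR.
    psnr = [30.0, 33.0, 36.0, 39.0]
    anchor_rate = [0.2, 0.4, 0.8, 1.6]
    test_rate = [0.8 * r for r in anchor_rate]
    print(bd_rate(anchor_rate, psnr, test_rate, psnr))
```

With the hypothetical points above, the result is approximately -20, meaning a 20% bitrate saving at equal quality. Under this convention, a figure such as the 22.46% saving over BPG quoted in the abstract would correspond to a BD-Rate of about -22.46.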


Pages: 5
