SSIR: Spatial shuffle multi-head self-attention for Single Image Super-Resolution

Cited by: 14
Authors
Zhao, Liangliang [1 ,2 ]
Gao, Junyu [1 ,2 ,3 ]
Deng, Donghu [1 ,2 ]
Li, Xuelong [1 ,2 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Artificial Intelligence Opt & Elect (iOPEN), Xian 710072, Shaanxi, Peoples R China
[2] Minist Ind & Informat Technol, Key Lab Intelligent Interact & Applicat, Xian 710072, Shaanxi, Peoples R China
[3] Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
Keywords
Single Image Super-Resolution; Long-range attention; Vision transformer
DOI
10.1016/j.patcog.2023.110195
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Benefiting from the development of deep convolutional neural networks, CNN-based single-image super-resolution methods have achieved remarkable reconstruction results. However, the limited receptive field of the convolutional kernel and the use of static weights at inference limit the performance of CNN-based methods. Recently, several vision-transformer-based image super-resolution methods have achieved excellent performance compared with CNN-based methods, but they contain many parameters and require vast amounts of GPU memory for training. In this paper, we propose a spatial shuffle multi-head self-attention for single-image super-resolution that can effectively model long-range pixel dependencies without additional computational cost. A local perception module is also proposed to incorporate the local connectivity and translation invariance of convolutional neural networks. Reconstruction results on five popular benchmarks show that the proposed method outperforms existing methods in both reconstruction accuracy and visual quality. The proposed method matches the performance of transformer-based methods while requiring fewer transformer blocks, reducing the number of parameters by 40%, GPU memory usage by 30%, and inference time by 30% compared with transformer-based methods.
Pages: 12
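
The abstract sketches two ideas: a spatial shuffle that lets windowed multi-head self-attention reach long-range pixel pairs at no extra cost, and a convolutional local perception module. Below is a minimal PyTorch sketch of the shuffle-attention idea only, reconstructed from the abstract's description: the names (spatial_shuffle, ShuffleMHSA), the ShuffleNet-style token permutation, and all shapes and hyperparameters are assumptions made for illustration, not the authors' published implementation (see the DOI above for the actual method).

```python
# Hypothetical sketch of spatial-shuffle windowed self-attention,
# inferred from the abstract; not the authors' implementation.
import torch
import torch.nn as nn


def spatial_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    # x: (B, N, C). ShuffleNet-style interleave along the token axis:
    # view as (groups, N // groups), transpose, flatten, so tokens that
    # were N // groups apart become adjacent. Requires N % groups == 0.
    b, n, c = x.shape
    return x.view(b, groups, n // groups, c).transpose(1, 2).reshape(b, n, c)


def spatial_unshuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    # Exact inverse permutation of spatial_shuffle.
    b, n, c = x.shape
    return x.view(b, n // groups, groups, c).transpose(1, 2).reshape(b, n, c)


class ShuffleMHSA(nn.Module):
    # Windowed multi-head self-attention over shuffled tokens: the
    # permutation is a free reshape, so long-range token pairs interact
    # at the same FLOP cost as plain window attention.
    def __init__(self, dim: int, num_heads: int, window: int, groups: int):
        super().__init__()
        self.window, self.groups = window, groups
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, N, C) with N divisible by both `window` and `groups`.
        b, n, c = x.shape
        x = spatial_shuffle(x, self.groups)
        x = x.reshape(b * (n // self.window), self.window, c)  # split windows
        x, _ = self.attn(x, x, x, need_weights=False)
        x = x.reshape(b, n, c)                                 # merge windows
        return spatial_unshuffle(x, self.groups)


if __name__ == "__main__":
    # Toy usage: 64 tokens (e.g. an 8x8 feature map), 32 channels.
    mhsa = ShuffleMHSA(dim=32, num_heads=4, window=16, groups=4)
    y = mhsa(torch.randn(2, 64, 32))
    print(y.shape)  # torch.Size([2, 64, 32])
```

The shuffle is a pure reindexing, so it adds no FLOPs; it only changes which tokens share an attention window, which is one plausible reading of the abstract's claim of modeling long-range dependencies without additional computational cost.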
Related papers
50 items in total (items [41]-[50] shown)
  • [41] RDNet: Lightweight Residual and Detail self-attention Network for infrared image super-resolution
    Chen, Feiyang
    Huang, Detian
    Lin, Mingxin
    Song, Jiaxun
    Huang, Xiaoqian
    INFRARED PHYSICS & TECHNOLOGY, 2024, 141
  • [42] Self-attention negative feedback network for real-time image super-resolution
    Liu, Xiangbin
    Chen, Shuqi
    Song, Liping
    Wozniak, Marcin
    Liu, Shuai
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2022, 34 (08) : 6179 - 6186
  • [43] Degradation-Aware Self-Attention Based Transformer for Blind Image Super-Resolution
    Liu, Qingguo
    Gao, Pan
    Han, Kang
    Liu, Ningzhong
    Xiang, Wei
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 7516 - 7528
  • [44] DLGSANet: Lightweight Dynamic Local and Global Self-Attention Network for Image Super-Resolution
    Li, Xiang
    Dong, Jiangxin
    Tang, Jinhui
    Pan, Jinshan
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 12746 - 12755
  • [45] Multi-attention augmented network for single image super-resolution
    Chen, Rui
    Zhang, Heng
    Liu, Jixin
    PATTERN RECOGNITION, 2022, 122
  • [46] Multi-Grained Attention Networks for Single Image Super-Resolution
    Wu, Huapeng
    Zou, Zhengxia
    Gui, Jie
    Zeng, Wen-Jun
    Ye, Jieping
    Zhang, Jun
    Liu, Hongyi
    Wei, Zhihui
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (02) : 512 - 522
  • [47] Image super-resolution via channel attention and spatial attention
    Lu, Enmin
    Hu, Xiaoxiao
    APPLIED INTELLIGENCE, 2022, 52 (02) : 2260 - 2268
  • [49] A spatial-spectral fusion convolutional transformer network with contextual multi-head self-attention for hyperspectral image classification
    Wang, Wuli
    Sun, Qi
    Zhang, Li
    Ren, Peng
    Wang, Jianbu
    Ren, Guangbo
    Liu, Baodi
    NEURAL NETWORKS, 2025, 187
  • [50] Attention as Relation: Learning Supervised Multi-head Self-Attention for Relation Extraction
    Liu, Jie
    Chen, Shaowei
    Wang, Bingquan
    Zhang, Jiaxin
    Li, Na
    Xu, Tong
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 3787 - 3793