WaveNet: Wavelet Network With Knowledge Distillation for RGB-T Salient Object Detection

Cited by: 59

Authors:

Zhou, Wujie [1]

Sun, Fan [1, 2]

Jiang, Qiuping [3]

Cong, Runmin [4]

Hwang, Jenq-Neng [5]

Affiliations:

[1] Zhejiang Univ Sci & Technol, Sch Informat & Elect Engn, Hangzhou 310023, Peoples R China

[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 308232, Singapore

[3] Ningbo Univ, Sch Informat Sci & Engn, Ningbo 315211, Peoples R China

[4] Shandong Univ, Sch Control Sci & Engn, Jinan, Peoples R China

[5] Univ Washington, Dept Elect Engn, Seattle, WA 98105 USA

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2023 / Vol. 32

Funding:

National Natural Science Foundation of China;

Keywords:

Transformers; Feature extraction; Discrete wavelet transforms; Training; Knowledge engineering; Cross layer design; Convolutional neural networks; Wavelet; knowledge distillation; discrete wavelet transform; progressively stretched sine-cosine module; edge-aware module; FUSION; IMAGE;

DOI:

10.1109/TIP.2023.3275538

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In recent years, various neural network architectures for computer vision have been devised, such as the visual transformer and multilayer perceptron (MLP). A transformer based on an attention mechanism can outperform a traditional convolutional neural network. Compared with the convolutional neural network and transformer, the MLP introduces less inductive bias and achieves stronger generalization. In addition, a transformer shows an exponential increase in the inference, training, and debugging times. Considering a wave function representation, we propose the WaveNet architecture that adopts a novel vision task-oriented wavelet-based MLP for feature extraction to perform salient object detection in RGB (red-green-blue)-thermal infrared images. In addition, we apply knowledge distillation to a transformer as an advanced teacher network to acquire rich semantic and geometric information and guide WaveNet learning with this information. Following the shortest-path concept, we adopt the Kullback-Leibler distance as a regularization term for the RGB features to be as similar to the thermal infrared features as possible. The discrete wavelet transform allows for the examination of frequency-domain features in a local time domain and time-domain features in a local frequency domain. We apply this representation ability to perform cross-modality feature fusion. Specifically, we introduce a progressively cascaded sine-cosine module for cross-layer feature fusion and use low-level features to obtain clear boundaries of salient objects through the MLP. Results from extensive experiments indicate that the proposed WaveNet achieves impressive performance on benchmark RGB-thermal infrared datasets. The results and code are publicly available at https://github.com/nowander/WaveNet.
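
Two ideas named in the abstract, the Kullback-Leibler regularization term between RGB and thermal features and the discrete wavelet transform, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). The function names `softmax`, `kl_divergence`, and `haar_dwt_1d` are hypothetical, the features are plain Python lists, the softmax normalization of the features is an assumption, and the single-level Haar wavelet is chosen here only because it is the simplest DWT; the abstract does not state which wavelet the paper uses.

```python
import math


def softmax(values):
    """Turn a raw feature vector into a probability distribution."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length.

    eps is an assumed smoothing constant that avoids log(0).
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def haar_dwt_1d(signal):
    """Single-level 1D Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


if __name__ == "__main__":
    # Hypothetical toy features standing in for RGB and thermal activations.
    rgb_features = [0.2, 1.5, -0.3, 0.9]
    thermal_features = [0.1, 1.2, 0.0, 1.1]

    # Regularization term: small when the two modalities agree.
    penalty = kl_divergence(softmax(rgb_features), softmax(thermal_features))
    print("KL regularization term:", penalty)

    # Low-frequency (approx) and high-frequency (detail) parts of one feature row.
    approx, detail = haar_dwt_1d(rgb_features)
    print("approximation:", approx)
    print("detail:", detail)
```

In this sketch the KL term is zero when the two normalized feature vectors match and grows as they diverge, which is the behavior a similarity regularizer needs. The Haar split separates each pair of neighboring values into an average (low-frequency) and a difference (high-frequency) component.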

Pages: 3027-3039

Page count: 13
