Does Thermal Really Always Matter for RGB-T Salient Object Detection?

Cited by: 46

Authors:

Cong, Runmin ^{[1,2,3]}

Zhang, Kepu ^{[1,2]}

Zhang, Chen ^{[1,2]}

Zheng, Feng ^{[4,5]}

Zhao, Yao ^{[1,2]}

Huang, Qingming ^{[6,7,8]}

Kwong, Sam ^{[3,9]}

Affiliations:

[1] Beijing Jiaotong Univ, Inst Informat Sci, Beijing 100044, Peoples R China

[2] Beijing Key Lab Adv Informat Sci & Network Technol, Beijing 100044, Peoples R China

[3] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China

[4] Southern Univ Sci & Technol, Dept Comp Sci & Technol, Shenzhen 518055, Peoples R China

[5] Res Inst Trustworthy Autonomous Syst, Shenzhen 518055, Peoples R China

[6] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 101408, Peoples R China

[7] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China

[8] Peng Cheng Lab, Shenzhen 518055, Peoples R China

[9] City Univ Hong Kong, Shenzhen Res Inst, Shenzhen 51800, Peoples R China

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2023 / Vol. 25

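Funding:

Beijing Natural Science Foundation; National Key Research and Development Program of China; National Natural Science Foundation of China;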

Keywords:

Task analysis; Decoding; Semantics; Object detection; Location awareness; Lighting; Feature extraction; RGB-T images; salient object detection; global illumination estimation; semantic constraint provider; localization and complementation; FUSION NETWORK;

DOI:

10.1109/TMM.2022.3216476

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

In recent years, RGB-T salient object detection (SOD) has attracted continuous attention, because introducing a thermal image makes it possible to identify salient objects in challenging environments such as low light. However, most existing RGB-T SOD models focus on how to perform cross-modality feature fusion and ignore whether the thermal image really always matters in the SOD task. Starting from the definition and nature of this task, this paper rethinks the meaning of the thermal modality and proposes a network named TNet to solve the RGB-T SOD task. We introduce a global illumination estimation module that predicts the global illuminance score of the image and uses it to regulate the role played by the two modalities. In addition, considering the role of the thermal modality, we set up different cross-modality interaction mechanisms in the encoding phase and the decoding phase. On the one hand, we introduce a semantic constraint provider in the encoding phase to enrich the semantics of thermal images, which makes the thermal modality more suitable for the SOD task. On the other hand, we introduce a two-stage localization and complementation module in the decoding phase to transfer the object localization cue and the internal integrity cue in thermal features to the RGB modality. Extensive experiments on three datasets show that the proposed TNet achieves competitive performance compared with 20 state-of-the-art methods.
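The abstract's idea of using a global illuminance score to regulate the two modalities can be illustrated with a toy sketch. This is not the TNet implementation: the record gives no architectural details, so everything here is an assumption. The function names `illumination_score` and `gate_features` are hypothetical, mean brightness stands in for what the paper describes as a predicted score, and a simple convex combination stands in for however the network actually weights RGB and thermal features.

```python
from typing import List, Sequence


def illumination_score(gray_pixels: Sequence[float]) -> float:
    """Hypothetical proxy: mean brightness of pixels given in [0, 1].

    Assumption for illustration only; the paper predicts this score
    with a dedicated module rather than computing a plain average.
    """
    if not gray_pixels:
        raise ValueError("image must contain at least one pixel")
    score = sum(gray_pixels) / len(gray_pixels)
    # Clamp so the score is always a valid mixing weight.
    return min(1.0, max(0.0, score))


def gate_features(
    rgb: Sequence[float], thermal: Sequence[float], score: float
) -> List[float]:
    """Blend two feature vectors; a brighter scene leans on RGB more.

    Assumed gating rule: score * rgb + (1 - score) * thermal.
    """
    if len(rgb) != len(thermal):
        raise ValueError("feature vectors must have the same length")
    return [score * r + (1.0 - score) * t for r, t in zip(rgb, thermal)]


if __name__ == "__main__":
    dark_image = [0.0, 0.25, 0.25, 0.5]
    s = illumination_score(dark_image)
    fused = gate_features([1.0, 1.0], [0.0, 0.0], s)
    print(s, fused)
```

Under this assumed rule, a dark scene yields a low score, so the fused features are dominated by the thermal input, while a well-lit scene shifts the weight back to RGB.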


Pages: 6971 - 6982

Page count: 12
