Depth Enhanced Cross-Modal Cascaded Network for RGB-D Salient Object Detection

Cited by: 4

Authors:

Zhao, Zhengyun [1]

Huang, Ziqing [1]

Chai, Xiuli [1]

Wang, Jun [1]

Affiliation:

[1] Henan Univ, Sch Artificial Intelligence, Zhengzhou 450046, Peoples R China

Source:

NEURAL PROCESSING LETTERS | 2023 / Vol. 55 / Issue 01

Funding:

National Natural Science Foundation of China;

Keywords:

RGB-D salient object detection; Convolutional neural network; Cross-modal fusion; Depth modal enhancement; FUSION; CONSISTENT; IMAGE;

DOI:

10.1007/s11063-022-10886-7

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The depth modality can provide supplementary features for RGB images, which substantially improves the performance of salient object detection (SOD). However, depth images are disturbed by external factors during acquisition, resulting in low-quality images. Moreover, the RGB and depth modalities differ, so simply fusing them cannot fully integrate the depth information into the RGB modality. To enhance the quality of the depth image and integrate the cross-modal information effectively, we propose a depth enhanced cross-modal cascaded network (DCCNet) for RGB-D SOD. The cascaded network comprises a depth cascaded branch, an RGB cascaded branch, and a cross-modal fusion strategy. In the depth cascaded branch, we design a depth preprocessing algorithm to enhance the quality of the depth image, and during depth feature extraction we adopt four cascaded cross-modal guided modules to guide the RGB feature extraction process. In the RGB cascaded branch, we design five cascaded residual adaptive selection modules to output the RGB features extracted at each stage. In the cross-modal fusion strategy, a cross-modal channel-wise refinement is adopted to fuse the top-level features of the two modality branches. Finally, a multiscale loss is adopted to optimize network training. Experimental results on six common RGB-D SOD datasets show that the performance of the proposed DCCNet is comparable to that of state-of-the-art RGB-D SOD methods.


Pages: 361-384

Page count: 24

