DMRA: Depth-Induced Multi-Scale Recurrent Attention Network for RGB-D Saliency Detection

Cited by: 48
Authors
Ji, Wei [1, 2]
Yan, Ge [2]
Li, Jingjing [1, 2]
Piao, Yongri [3]
Yao, Shunyu [2]
Zhang, Miao [4]
Cheng, Li [1]
Lu, Huchuan [3]
Affiliations
[1] Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB T5V 1A4, Canada
[2] Dalian Univ Technol, Sch Software Technol, Dalian 116024, Peoples R China
[3] Dalian Univ Technol, Sch Informat & Commun Engn, Fac Elect Informat & Elect Engn, Dalian 116024, Peoples R China
[4] Dalian Univ Technol, DUT RU Int Sch Informat & Software Engn, Key Lab Ubiquitous Network & Serv Software Liaoni, Dalian 116024, Peoples R China
Funding
National Natural Science Foundation of China; Natural Sciences and Engineering Research Council of Canada
Keywords
Feature extraction; Saliency detection; Semantics; Random access memory; Cameras; Analytical models; Visualization; RGB-D saliency detection; salient object detection; convolutional neural networks; cross-modal fusion; OBJECT DETECTION; FUSION; SEGMENTATION;
DOI
10.1109/TIP.2022.3154931
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work, we propose a novel depth-induced multi-scale recurrent attention network for RGB-D saliency detection, named DMRA. It achieves strong performance, especially in complex scenarios. Our network makes four main contributions, each experimentally shown to have significant practical merit. First, we design an effective depth refinement block that uses residual connections to fully extract and fuse cross-modal complementary cues from the RGB and depth streams. Second, depth cues with abundant spatial information are combined with multi-scale contextual features to accurately locate salient objects. Third, a novel recurrent attention module, inspired by the Internal Generative Mechanism of the human brain, generates more accurate saliency results by comprehensively learning the internal semantic relations of the fused features and progressively optimizing local details with memory-oriented scene understanding. Finally, a cascaded hierarchical feature fusion strategy promotes efficient information interaction among multi-level contextual features and further improves the contextual representability of the model. In addition, we introduce a new real-life RGB-D saliency dataset containing a variety of complex scenarios, which has been widely used as a benchmark in recent RGB-D saliency detection research. Extensive experiments demonstrate that our method accurately identifies salient objects and achieves appealing performance against 18 state-of-the-art RGB-D saliency models on nine benchmark datasets.
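The abstract says the depth refinement block uses residual connections to fuse RGB and depth cues, but it does not give the block's internals. The following is therefore only a toy sketch of the general idea, not the authors' implementation: features are flat lists of floats, a single scalar weight stands in for each learned convolution, and the names `residual_unit` and `depth_refine_and_fuse` (and the parameters `w1`, `w2`) are hypothetical.

```python
from typing import List

Vector = List[float]


def relu(x: Vector) -> Vector:
    """Element-wise ReLU."""
    return [max(0.0, v) for v in x]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two equally sized feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return [p + q for p, q in zip(a, b)]


def residual_unit(x: Vector, weight: float) -> Vector:
    """Residual connection: y = x + relu(weight * x).

    Assumption: the scalar `weight` is a stand-in for a learned
    convolution; a real block would operate on 2-D feature maps.
    """
    transformed = relu([weight * v for v in x])
    return add(x, transformed)


def depth_refine_and_fuse(rgb_feat: Vector, depth_feat: Vector,
                          w1: float = 0.5, w2: float = 0.5) -> Vector:
    """Refine depth features with two residual units, then add them
    to the RGB features of the same level (element-wise fusion)."""
    refined = residual_unit(residual_unit(depth_feat, w1), w2)
    return add(rgb_feat, refined)


if __name__ == "__main__":
    fused = depth_refine_and_fuse([1.0, 2.0], [2.0, -4.0])
    print(fused)  # prints [5.5, -2.0]
```

The identity path in `residual_unit` means the depth signal is never discarded, only adjusted, which is the usual motivation for a residual design; whether DMRA fuses by addition at every level is an assumption here.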
Pages: 2321-2336
Number of pages: 16
Related papers
50 in total
  • [31] TWO-STREAM REFINEMENT NETWORK FOR RGB-D SALIENCY DETECTION
    Liu, Di
    Hu, Yaosi
    Zhang, Kao
    Chen, Zhenzhong
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 3925 - 3929
  • [32] Towards accurate RGB-D saliency detection with complementary attention and adaptive integration
    Bi, Hong-Bo
    Liu, Zi-Qi
    Wang, Kang
    Dong, Bo
    Chen, Geng
    Ma, Ji-Quan
    NEUROCOMPUTING, 2021, 439 : 63 - 74
  • [33] RGB-D Saliency Detection with Multi-feature-fused Optimization
    Zhang, Tianyi
    Yang, Zhong
    Song, Jiarong
    IMAGE AND GRAPHICS (ICIG 2017), PT III, 2017, 10668 : 15 - 26
  • [34] Multi-scale fusion for RGB-D indoor semantic segmentation
    Jiang, Shiyi
    Xu, Yang
    Li, Danyang
    Fan, Runze
    SCIENTIFIC REPORTS, 2022, 12 (01):
  • [35] SLMSF-Net: A Semantic Localization and Multi-Scale Fusion Network for RGB-D Salient Object Detection
    Peng, Yanbin
    Zhai, Zhinian
    Feng, Mingkun
    SENSORS, 2024, 24 (04)
  • [36] Multi-scale fusion for RGB-D indoor semantic segmentation
    Shiyi Jiang
    Yang Xu
    Danyang Li
    Runze Fan
    Scientific Reports, 12 (1)
  • [37] Joint Multiview Segmentation and Localization of RGB-D Images using Depth-Induced Silhouette Consistency
    Zhang, Chi
    Li, Zhiwei
    Cai, Rui
    Chao, Hongyang
    Rui, Yong
    2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 4031 - 4039
  • [38] RGB-D Salient Object Detection via Feature Fusion and Multi-scale Enhancement
    Wu, Peiliang
    Duan, Liangliang
    Kong, Lingfu
    COMPUTER VISION, CCCV 2015, PT II, 2015, 547 : 359 - 368
  • [39] Traffic Sign Detection Using a Multi-Scale Recurrent Attention Network
    Tian, Yan
    Gelernter, Judith
    Wang, Xun
    Li, Jianyuan
    Yu, Yizhou
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2019, 20 (12) : 4466 - 4475
  • [40] Bilateral Attention Network for RGB-D Salient Object Detection
    Zhang, Zhao
    Lin, Zheng
    Xu, Jun
    Jin, Wen-Da
    Lu, Shao-Ping
    Fan, Deng-Ping
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 1949 - 1961