Salient Object Detection in RGB-D Videos

Cited by: 0
Authors
Mou, Ao [1]
Lu, Yukang [1]
He, Jiahao [1]
Min, Dingyao [1]
Fu, Keren [1]
Zhao, Qijun [1]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610065, Peoples R China
Keywords
Salient object detection; RGB-D videos; depth; optical flow; multi-modal fusion; NETWORK; OPTIMIZATION; FUSION
DOI
10.1109/TIP.2024.3498326
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Given the widespread adoption of depth-sensing acquisition devices, RGB-D videos and related data/media have gained considerable traction in various aspects of daily life. Consequently, salient object detection (SOD) in RGB-D videos is a highly promising and evolving avenue. Despite this potential, SOD in RGB-D videos remains under-explored, with RGB-D SOD and video SOD (VSOD) traditionally studied in isolation. To explore this emerging field, this paper makes two primary contributions: a dataset and a model. First, we construct RDVS, a new RGB-D VSOD dataset with realistic depth, characterized by diverse scenes and rigorous frame-by-frame annotations. We validate the dataset through comprehensive attribute and object-oriented analyses, and provide training and testing splits. Second, we introduce DCTNet+, a three-stream network tailored for RGB-D VSOD that emphasizes the RGB modality and treats depth and optical flow as auxiliary modalities. In pursuit of effective feature enhancement, refinement, and fusion for precise final prediction, we propose two modules: the multi-modal attention module (MAM) and the refinement fusion module (RFM). To enhance interaction and fusion within RFM, we design a universal interaction module (UIM) and then integrate holistic multi-modal attentive paths (HMAPs) for refining multi-modal low-level features before they reach the RFMs. Comprehensive experiments, conducted on pseudo RGB-D video datasets alongside our proposed RDVS, highlight the superiority of DCTNet+ over 19 VSOD models and 14 RGB-D SOD models. Additionally, ablation experiments on both pseudo and realistic RGB-D video datasets demonstrate the advantages of the individual modules as well as the necessity of introducing realistic depth into VSOD. Our code, together with the RDVS dataset, will be available at https://github.com/kerenfu/RDVS/.
Pages: 6660-6675
Number of pages: 16