Weakly-Supervised RGBD Video Object Segmentation

Authors
Yang, Jinyu [1 ,2 ]
Gao, Mingqi [1 ,3 ]
Zheng, Feng [4 ]
Zhen, Xiantong [5 ]
Ji, Rongrong [6 ]
Shao, Ling [7 ]
Leonardis, Ales [8 ]
Affiliations
[1] Southern Univ Sci & Technol, Dept Comp Sci & Engn, Shenzhen 518055, Peoples R China
[2] Univ Birmingham, Birmingham B15 2TT, England
[3] Univ Warwick, Coventry CV4 7AL, England
[4] Southern Univ Sci & Technol, Shenzhen 518055, Peoples R China
[5] Guangdong Univ Petrochem Technol, Coll Comp Sci, Maoming 525011, Peoples R China
[6] Xiamen Univ, Sch Informat, Dept Artificial Intelligence, Media Analyt & Comp Lab, Xiamen 361005, Peoples R China
[7] Univ Chinese Acad Sci, UCAS Terminus AI Lab, Beijing 101408, Peoples R China
[8] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, England
Funding
National Natural Science Foundation of China
Keywords
Annotations; Object segmentation; Training; Target tracking; Task analysis; Object tracking; Benchmark testing; RGBD data; video object segmentation; visual tracking; TRACKING;
DOI
10.1109/TIP.2024.3374130
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Depth information opens up new opportunities for video object segmentation (VOS) to be more accurate and robust in complex scenes. However, the RGBD VOS task is largely unexplored due to the expensive collection of RGBD data and time-consuming annotation of segmentation. In this work, we first introduce a new benchmark for RGBD VOS, named DepthVOS, which contains 350 videos (over 55k frames in total) annotated with masks and bounding boxes. We further propose a novel, strong baseline model, the Fused Color-Depth Network (FusedCDNet), which can be trained solely under the supervision of bounding boxes, while being used to generate masks with a bounding box guideline only in the first frame. Thereby, the model possesses three major advantages: a weakly-supervised training strategy to overcome the high-cost annotation, a cross-modal fusion module to handle complex scenes, and weakly-supervised inference to promote ease of use. Extensive experiments demonstrate that our proposed method performs on par with top fully-supervised algorithms. We will open-source our project at https://github.com/yjybuaa/depthvos/ to facilitate the development of RGBD VOS.
Pages: 2158-2170
Page count: 13