Semantic scene completion with dense CRF from a single depth image

Cited by: 22
Authors
Zhang, Liang [1]
Wang, Le [2]
Zhang, Xiangdong [2]
Shen, Peiyi [1]
Bennamoun, Mohammed [3]
Zhu, Guangming [1]
Shah, Syed Afaq Ali [4]
Song, Juan [1]
Affiliations
[1] Xidian Univ, Sch Software Engn, Xian 710071, Shaanxi, Peoples R China
[2] Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China
[3] Univ Western Australia, Sch Comp Sci & Software Engn, Perth, WA 6009, Australia
[4] Cent Queensland Univ, Sch Engn & Technol, Perth, WA 6000, Australia
Funding
Australian Research Council;
Keywords
Semantic scene completion; Single depth image; Dense conditional random field (CRF); Truncated signed distance function (TSDF); Inference; OBJECT DETECTION; FEATURES; CNN;
DOI
10.1016/j.neucom.2018.08.052
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Scene understanding is a significant research topic in computer vision, especially for robots that must understand their environment intelligently. Semantic scene segmentation helps robots identify the objects present in their surroundings, while semantic scene completion enhances a robot's ability to infer object shape, which is pivotal for several high-level tasks. With a dense Conditional Random Field (CRF), one key issue is how to construct the long-range interactions between nodes with Gaussian pairwise potentials. Another issue is which effective and efficient inference algorithms can be adapted to solve the resulting optimization problem. In this paper, we address semantic scene segmentation and scene completion simultaneously, optimizing both with a dense CRF based on a single depth image only. Firstly, we convert the single depth image into different down-sampled Truncated Signed Distance Function (TSDF) or flipped TSDF voxel formats, and formulate the pairwise potential terms with such a representation. Secondly, we use the output of an end-to-end 3D convolutional neural network named SSCNet to obtain the unary potentials. Finally, we examine the efficiency of different CRF inference algorithms (mean-field inference, the negative semi-definite specific difference of convex relaxation, the proximal minimization of linear programming and its variants, etc.). The proposed dense CRF and inference algorithms are evaluated on three different datasets (SUNCG, NYU, and NYUCAD). Experimental results demonstrate that the voxel-level intersection over union (IoU) of the predicted semantic labels and scene completion reaches the state of the art. Specifically, for voxel semantic segmentation, the highest IoU improvements are 2.6%, 1.3%, and 3.1%, and for scene completion, the highest IoU improvements are 2.5%, 3.7%, and 5.4%, for the SUNCG, NYU, and NYUCAD datasets respectively. © 2018 Elsevier B.V. All rights reserved.
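The abstract names mean-field inference over a dense CRF with Gaussian pairwise potentials, and voxel-level IoU as the metric. The following is a minimal sketch of both ideas, not the authors' implementation: it assumes a Potts label compatibility, a single Gaussian kernel over voxel coordinates only, and a naive O(N^2) message-passing loop. The function names `mean_field` and `voxel_iou` and the parameters `weight`, `sigma`, and `n_iters` are hypothetical, chosen for illustration.

```python
import math


def mean_field(unary, positions, n_iters=5, weight=1.0, sigma=1.0):
    """Naive O(N^2) mean-field inference for a dense CRF (illustrative only).

    unary[i][l]  : unary energy (e.g. negative log-probability) of label l at voxel i
    positions[i] : (x, y, z) voxel coordinate used by the Gaussian kernel
    Returns the approximate per-voxel label distributions q[i][l].
    Assumption: Potts compatibility (cost `kernel` whenever two labels differ).
    """
    n, n_labels = len(unary), len(unary[0])

    # Gaussian kernel k(i, j) = weight * exp(-|p_i - p_j|^2 / (2 sigma^2)), zero for i == j.
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
                kernel[i][j] = weight * math.exp(-d2 / (2.0 * sigma ** 2))

    def softmax_neg(energies):
        # Normalised exp(-energy), shifted by the minimum for numerical stability.
        m = min(energies)
        exps = [math.exp(-(e - m)) for e in energies]
        z = sum(exps)
        return [e / z for e in exps]

    # Initialise beliefs from the unary energies alone.
    q = [softmax_neg(u) for u in unary]
    for _ in range(n_iters):
        new_q = []
        for i in range(n):
            # Message passing: kernel-weighted sum of the other voxels' beliefs per label.
            msg = [sum(kernel[i][j] * q[j][l] for j in range(n)) for l in range(n_labels)]
            total = sum(msg)
            # Potts model: label l is penalised by the belief mass on all other labels.
            energies = [unary[i][l] + (total - msg[l]) for l in range(n_labels)]
            new_q.append(softmax_neg(energies))
        q = new_q  # parallel update of all voxels
    return q


def voxel_iou(pred, truth, label):
    """Intersection over union of one label across two flat lists of voxel labels."""
    inter = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
    union = sum(1 for p, t in zip(pred, truth) if p == label or t == label)
    return inter / union if union else float("nan")


if __name__ == "__main__":
    # Three voxels in a row; the middle one weakly prefers label 1,
    # while its two neighbours strongly prefer label 0.
    unary = [[0.1, 3.0], [0.8, 0.6], [0.1, 3.0]]
    positions = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
    q = mean_field(unary, positions)
    labels = [max(range(2), key=lambda l: qi[l]) for qi in q]
    print(labels, voxel_iou(labels, [0, 0, 0], 0))
```

In this toy setup the pairwise term pulls the weakly labelled middle voxel toward its neighbours' label. In the paper's setting, per the abstract, the unary terms come from SSCNet outputs and the pairwise terms are built on the TSDF or flipped TSDF voxel representation, neither of which this sketch models.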
Pages: 182-195
Number of pages: 14
Related papers
50 in total
  • [1] Semantic Scene Completion from a Single Depth Image
    Song, Shuran
    Yu, Fisher
    Zeng, Andy
    Chang, Angel X.
    Savva, Manolis
    Funkhouser, Thomas
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 190 - 198
  • [2] View-Volume Network for Semantic Scene Completion from a Single Depth Image
    Guo, Yuxiao
    Tong, Xin
    PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 726 - 732
  • [3] Semantic Scene Completion from a Single 360-Degree Image and Depth Map
    Dourado, Aloisio
    Kim, Hansung
    de Campos, Teofilo E.
    Hilton, Adrian
    PROCEEDINGS OF THE 15TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS, VOL 5: VISAPP, 2020, : 36 - 46
  • [4] Adversarial Semantic Scene Completion from a Single Depth Image
    Wang, Yida
    Tan, David Joseph
    Navab, Nassir
    Tombari, Federico
    2018 INTERNATIONAL CONFERENCE ON 3D VISION (3DV), 2018, : 426 - 434
  • [5] 3D SEMANTIC SCENE COMPLETION FROM A SINGLE DEPTH IMAGE USING ADVERSARIAL TRAINING
    Chen, Yueh-Tung
    Garbade, Martin
    Gall, Juergen
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 1835 - 1839
  • [6] In Depth Bayesian Semantic Scene Completion
    Gillsjo, David
    Astrom, Kalle
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 6335 - 6342
  • [7] EdgeNet: Semantic Scene Completion from a Single RGB-D Image
    Dourado, Aloisio
    De Campos, Teofilo E.
    Kim, Hansung
    Hilton, Adrian
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 503 - 510
  • [8] ForkNet: Multi-branch Volumetric Semantic Completion from a Single Depth Image
    Wang, Yida
    Tan, David Joseph
    Navab, Nassir
    Tombari, Federico
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 8607 - 8616
  • [9] Scene Intrinsics and Depth from a Single Image
    Shelhamer, Evan
    Barron, Jonathan T.
    Darrell, Trevor
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOP (ICCVW), 2015, : 235 - 242
  • [10] Combining Semantic Scene Priors and Haze Removal for Single Image Depth Estimation
    Wang, Ke
    Dunn, Enrique
    Tighe, Joseph
    Frahm, Jan-Michael
    2014 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2014, : 800 - 807