SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation

Cited by: 230
Authors
Liu, Zechen [1]
Wu, Zizhang [1]
Toth, Roland [2]
Affiliations
[1] ZongMu Tech, Beijing, Peoples R China
[2] TU/e, Beijing, Peoples R China
Keywords
DOI
10.1109/CVPRW50498.2020.00506
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Estimating the 3D orientation and translation of objects is essential for infrastructure-less autonomous navigation and driving. In the case of monocular vision, successful methods have been mainly based on two ingredients: (i) a network generating 2D region proposals, and (ii) an R-CNN structure predicting 3D object pose by utilizing the acquired regions of interest. We argue that the 2D detection network is redundant and introduces non-negligible noise for 3D detection. Hence, we propose a novel 3D object detection method, named SMOKE, that predicts a 3D bounding box for each detected object by combining a single keypoint estimate with regressed 3D variables. As a second contribution, we propose a multi-step disentangling approach for constructing the 3D bounding box, which significantly improves both training convergence and detection accuracy. In contrast to previous 3D detection techniques, our method does not require complicated pre/post-processing, extra data, or a refinement stage. Despite its structural simplicity, our proposed SMOKE network outperforms all existing monocular 3D detection methods on the KITTI dataset, giving state-of-the-art results on both 3D object detection and bird's-eye-view evaluation. The code is available at https://github.com/lzccccc/SMOKE.
Pages: 4289 - 4298
Page count: 10
Related papers
50 in total
  • [31] DCGNN: a single-stage 3D object detection network based on density clustering and graph neural network
    Xiong, Shimin
    Li, Bin
    Zhu, Shiao
    COMPLEX & INTELLIGENT SYSTEMS, 2023, 9 (03) : 3399 - 3408
  • [33] Instance-aware sampling and voxel-transformer encoding for single-stage 3D object detection
    Wang, Baotong
    Xia, Chenxing
    Gao, Xiuju
    Yang, Yuan
    Li, Kuan-Ching
    Fang, Xianjin
    Zhang, Yan
    Ge, Sijia
    DIGITAL SIGNAL PROCESSING, 2025, 162
  • [34] FCOS3D: Fully Convolutional One-Stage Monocular 3D Object Detection
    Wang, Tai
    Zhu, Xinge
    Pang, Jiangmiao
    Lin, Dahua
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 913 - 922
  • [35] OKGR: Occluded Keypoint Generation and Refinement for 3D Object Detection
    Ji, Mingqian
    Yang, Jian
    Zhang, Shanshan
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT XII, 2024, 14436 : 3 - 15
  • [36] Learning Auxiliary Monocular Contexts Helps Monocular 3D Object Detection
    Liu, Xianpeng
    Xue, Nan
    Wu, Tianfu
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 1810 - 1818
  • [37] Single-Stage Keypoint-Based Category-Level Object Pose Estimation from an RGB Image
    Lin, Yunzhi
    Tremblay, Jonathan
    Tyree, Stephen
    Vela, Patricio A.
    Birchfield, Stan
    2022 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2022), 2022, : 1547 - 1553
  • [38] One Stage Monocular 3D Object Detection Utilizing Discrete Depth and Orientation Representation
    Haq, Muhamad Amirul
    Ruan, Shanq-Jang
    Shao, Mei-En
    ul Haq, Qazi Mazhar
    Liang, Pei-Jung
    Gao, De-Qin
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (11) : 21630 - 21640
  • [39] Accurate Monocular 3D Object Detection via Color-Embedded 3D Reconstruction for Autonomous Driving
    Ma, Xinzhu
    Wang, Zhihui
    Li, Haojie
    Zhang, Pengbo
    Ouyang, Wanli
    Fan, Xin
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 6850 - 6859
  • [40] Single-Stage Refinement CNN for Depth Estimation in Monocular Images
    Valdez Rodriguez, Jose E.
    Calvo, Hiram
    Felipe Riveron, Edgardo M.
    COMPUTACION Y SISTEMAS, 2020, 24 (02) : 439 - 451