SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation

Cited by: 230

Authors:

Liu, Zechen [1]

Wu, Zizhang [1]

Toth, Roland [2]

Affiliations:

[1] ZongMu Tech, Beijing, Peoples R China

[2] TU/e, Beijing, Peoples R China

Source:

2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020) | 2020

Keywords:

DOI:

10.1109/CVPRW50498.2020.00506

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Estimating the 3D orientation and translation of objects is essential for infrastructure-less autonomous navigation and driving. In monocular vision, successful methods have mainly relied on two ingredients: (i) a network generating 2D region proposals, and (ii) an R-CNN structure predicting 3D object pose from the acquired regions of interest. We argue that the 2D detection network is redundant and introduces non-negligible noise for 3D detection. Hence, we propose a novel 3D object detection method, named SMOKE, that predicts a 3D bounding box for each detected object by combining a single keypoint estimate with regressed 3D variables. As a second contribution, we propose a multi-step disentangling approach for constructing the 3D bounding box, which significantly improves both training convergence and detection accuracy. In contrast to previous 3D detection techniques, our method does not require complicated pre/post-processing, extra data, or a refinement stage. Despite its structural simplicity, the proposed SMOKE network outperforms all existing monocular 3D detection methods on the KITTI dataset, giving state-of-the-art results on both 3D object detection and bird's-eye-view evaluation. The code is available at https://github.com/lzccccc/SMOKE.
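
The idea of "combining a single keypoint estimate with regressed 3D variables" can be illustrated with basic pinhole-camera geometry. The following is a minimal sketch only, not the authors' implementation: the abstract does not specify how SMOKE encodes depth, dimensions, or orientation, so the function names, the intrinsics (`fx`, `fy`, `cx`, `cy`), and the sample values are all hypothetical. It assumes the keypoint is the image projection of the box center and that depth, box dimensions, and yaw are already available as regressed values.

```python
import math


def backproject_keypoint(u, v, depth, fx, fy, cx, cy):
    """Lift an image keypoint (u, v) with a depth estimate to a 3D point
    in camera coordinates, assuming a distortion-free pinhole camera."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def box_corners(center, dims, yaw):
    """Return the 8 corners of a 3D box in camera coordinates.

    center: (x, y, z) of the box center (assumed convention)
    dims:   (height, width, length) of the box
    yaw:    rotation about the camera y axis, in radians
    """
    h, w, l = dims
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (l / 2, -l / 2):
        for dy in (h / 2, -h / 2):
            for dz in (w / 2, -w / 2):
                # Rotate the local offset about the y axis, then translate.
                x = cos_y * dx + sin_y * dz + center[0]
                y = dy + center[1]
                z = -sin_y * dx + cos_y * dz + center[2]
                corners.append((x, y, z))
    return corners


def project(point, fx, fy, cx, cy):
    """Project a 3D camera-frame point back to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


if __name__ == "__main__":
    # Hypothetical intrinsics and network outputs, for illustration only.
    fx = fy = 720.0
    cx, cy = 620.0, 190.0
    center = backproject_keypoint(700.0, 200.0, 20.0, fx, fy, cx, cy)
    corners = box_corners(center, dims=(1.5, 1.6, 3.9), yaw=0.3)
    print(center)
    print(len(corners))
```

Under these assumptions, a single keypoint plus a depth value fixes the 3D location, and the regressed dimensions and yaw fix the box extent and heading, so no 2D region proposal is needed to assemble the box.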


Pages: 4289-4298

Page count: 10

50 items in total

[21] Single-Stage 6D Object Pose Estimation
Hu, Yinlin
Fua, Pascal
Wang, Wei
Salzmann, Mathieu
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 2927 - 2936
[22] Multi-feature enhancement based on sparse networks for single-stage 3D object detection
Ke, Zunwang
Lin, Chenyu
Zhang, Tao
Jia, Tingting
Du, Minghua
Wang, Gang
Zhang, Yugui
ALEXANDRIA ENGINEERING JOURNAL, 2025, 111 : 123 - 135
[23] Monocular 3D Object Detection for Autonomous Driving
Chen, Xiaozhi
Kundu, Kaustav
Zhang, Ziyu
Ma, Huimin
Fidler, Sanja
Urtasun, Raquel
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 2147 - 2156
[24] Dimension Embeddings for Monocular 3D Object Detection
Zhang, Yunpeng
Zheng, Wenzhao
Zhu, Zheng
Huang, Guan
Du, Dalong
Zhou, Jie
Lu, Jiwen
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 1579 - 1588
[25] Pose Anchor: A Single-Stage Hand Keypoint Detection Network
Li, Yuan
Wang, Xinggang
Liu, Wenyu
Feng, Bin
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30 (07) : 2104 - 2113
[26] Multivariate Probabilistic Monocular 3D Object Detection
Shi, Xuepeng
Chen, Zhixiang
Kim, Tae-Kyun
2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 4270 - 4279
[27] Uncertainty Prediction for Monocular 3D Object Detection
Mun, Junghwan
Choi, Hyukdoo
SENSORS, 2023, 23 (12)
[28] Monocular 3D object detection for distant objects
Li, Jiahao
Han, Xiaohong
JOURNAL OF ELECTRONIC IMAGING, 2024, 33 (03) : 33021
[29] Homography Loss for Monocular 3D Object Detection
Gu, Jiaqi
Wu, Bojian
Fan, Lubin
Huang, Jianqiang
Cao, Shen
Xiang, Zhiyu
Hua, Xian-Sheng
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 1070 - 1079
[30] MoGDE: Boosting Mobile Monocular 3D Object Detection with Ground Depth Estimation
Zhou, Yunsong
Liu, Quan
Zhu, Hongzi
Li, Yunzhe
Chang, Shan
Guo, Minyi
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
