SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation

Cited by: 230

Authors:

Liu, Zechen [1]

Wu, Zizhang [1]

Toth, Roland [2]

Affiliations:

[1] ZongMu Tech, Beijing, People's Republic of China

[2] TU/e, Beijing, People's Republic of China

Source:

2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020) | 2020

Keywords:

DOI:

10.1109/CVPRW50498.2020.00506

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Estimating the 3D orientation and translation of objects is essential for infrastructure-less autonomous navigation and driving. In the case of monocular vision, successful methods have been based mainly on two ingredients: (i) a network generating 2D region proposals and (ii) an R-CNN structure predicting 3D object pose from the acquired regions of interest. We argue that the 2D detection network is redundant and introduces non-negligible noise for 3D detection. Hence, in this paper we propose a novel 3D object detection method, named SMOKE, that predicts a 3D bounding box for each detected object by combining a single keypoint estimate with regressed 3D variables. As a second contribution, we propose a multi-step disentangling approach for constructing the 3D bounding box, which significantly improves both training convergence and detection accuracy. In contrast to previous 3D detection techniques, our method does not require complicated pre/post-processing, extra data, or a refinement stage. Despite its structural simplicity, our proposed SMOKE network outperforms all existing monocular 3D detection methods on the KITTI dataset, giving state-of-the-art results on both 3D object detection and bird's-eye-view evaluation. The code is available at https://github.com/lzccccc/SMOKE.
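
The idea of combining one keypoint with regressed 3D variables can be illustrated with a minimal sketch. It is not the authors' implementation: this record does not give the exact parameterization, so the sketch assumes a standard pinhole camera, treats the keypoint as the image projection of the box's geometric center, and takes depth, dimensions, and yaw as already-regressed values. The function names `backproject_keypoint` and `box_corners`, the intrinsics, and all numbers are hypothetical.

```python
import math
from itertools import product


def backproject_keypoint(u, v, z, fx, fy, cx, cy):
    """Lift a keypoint (u, v) with depth z to camera coordinates (pinhole model)."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)


def box_corners(center, dims, yaw):
    """Return the 8 corners of a box centred at `center`.

    Assumptions: `dims` is (height, width, length), and `yaw` rotates the box
    about the camera y-axis.
    """
    h, w, l = dims
    px, py, pz = center
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for sx, sy, sz in product((-1, 1), repeat=3):
        # Offset in the object frame: length along x, height along y, width along z.
        ox, oy, oz = sx * l / 2, sy * h / 2, sz * w / 2
        # Rotate about the y-axis, then translate to the box centre.
        corners.append((px + c * ox + s * oz, py + oy, pz - s * ox + c * oz))
    return corners


# Hypothetical intrinsics and regressed values, for illustration only.
center = backproject_keypoint(u=670.0, v=215.0, z=20.0,
                              fx=700.0, fy=700.0, cx=600.0, cy=180.0)
corners = box_corners(center, dims=(1.5, 1.6, 4.0), yaw=0.0)
```

In this sketch the keypoint fixes the viewing ray, the depth picks a point along it, and the dimensions and yaw complete the box, so no 2D region proposal is involved.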


Pages: 4289-4298

Page count: 10
