Monocular 3D Object Detection Utilizing Auxiliary Learning With Deformable Convolution

Cited by: 2
Authors
Chen, Jiun-Han [1]
Shieh, Jeng-Lun [1]
Haq, Muhamad Amirul [1]
Ruan, Shanq-Jang [1]
Affiliations
[1] Natl Taiwan Univ Sci & Technol, Dept Elect & Comp Engn, Taipei 10607, Taiwan
Keywords
Three-dimensional displays; Object detection; Solid modeling; Feature extraction; Training; Computational modeling; Task analysis; 3D object detection; monocular camera; driving scene understanding; auxiliary learning; deep learning;
DOI
10.1109/TITS.2023.3319556
Chinese Library Classification number
TU [Building Science];
Discipline classification code
0813;
Abstract
In autonomous driving systems, monocular 3D object detection is a crucial component, and the safety of autonomous vehicles depends heavily on a well-designed detection system. Developing a robust and efficient 3D object detection algorithm is therefore a major goal for institutes and researchers. A sense of 3D is essential in autonomous vehicles and robotics, as it allows the system to understand its surroundings and react accordingly. Compared with stereo-based and LiDAR-based methods, monocular 3D object detection is low-cost and less computationally intensive, and it has great potential, but it is also challenging because it must generate complex 3D features from 2D information alone. As a result, the performance of monocular methods is impaired by the lack of depth information. In this paper, we propose a simple, effective, end-to-end network for monocular 3D object detection that uses no external training data. Our work is inspired by auxiliary learning: we use a robust feature extractor as the backbone and multiple regression heads to learn auxiliary knowledge. These auxiliary regression heads are discarded after training to improve inference efficiency, so the model benefits from auxiliary learning during training and learns critical information more conceptually. The proposed method achieves 17.28% and 20.10% at the moderate level of the Car category on the KITTI benchmark test set and validation set, respectively, outperforming previous monocular 3D object detection approaches.
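The auxiliary-learning idea in the abstract (a shared backbone, extra regression heads that contribute to the training loss, and removal of those heads before inference) can be illustrated with a toy sketch. This is not the authors' network: the tiny linear backbone, the class name `AuxNet`, the head names, the loss weight `aux_weight`, and the data are all hypothetical, and no deformable convolution is involved.

```python
import random


def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))


class AuxNet:
    """Toy model (hypothetical): linear backbone shared by a main head and auxiliary heads."""

    def __init__(self, in_dim, feat_dim, aux_names, seed=0):
        rng = random.Random(seed)

        def init(n):
            return [rng.uniform(-0.5, 0.5) for _ in range(n)]

        self.backbone = [init(in_dim) for _ in range(feat_dim)]
        self.heads = {"main": init(feat_dim)}
        for name in aux_names:
            self.heads[name] = init(feat_dim)

    def forward(self, x):
        # Every head reads the same shared features.
        feats = [dot(row, x) for row in self.backbone]
        return feats, {name: dot(w, feats) for name, w in self.heads.items()}

    def train_step(self, x, targets, aux_weight=0.5, lr=0.01):
        """One SGD step on: main squared error + aux_weight * sum of auxiliary squared errors."""
        feats, preds = self.forward(x)
        loss = 0.0
        grad_feats = [0.0] * len(feats)
        for name, w in self.heads.items():
            coef = 1.0 if name == "main" else aux_weight
            err = preds[name] - targets[name]
            loss += coef * err * err
            for j in range(len(w)):
                # Accumulate the backbone gradient with the pre-update head weight.
                grad_feats[j] += 2 * coef * err * w[j]
                w[j] -= lr * 2 * coef * err * feats[j]
        # Auxiliary errors reach the backbone through grad_feats.
        for j, row in enumerate(self.backbone):
            for i in range(len(row)):
                row[i] -= lr * grad_feats[j] * x[i]
        return loss

    def strip_auxiliary_heads(self):
        """Drop auxiliary heads after training; the main head and backbone are untouched."""
        self.heads = {"main": self.heads["main"]}


# Made-up regression targets, for illustration only.
data = [
    ([0.5, 1.0], {"main": 1.5, "aux_a": -0.5, "aux_b": 1.0}),
    ([1.0, 0.0], {"main": 1.0, "aux_a": 1.0, "aux_b": 2.0}),
    ([0.0, 1.0], {"main": 1.0, "aux_a": -1.0, "aux_b": 0.0}),
    ([1.0, 1.0], {"main": 2.0, "aux_a": 0.0, "aux_b": 2.0}),
]

net = AuxNet(in_dim=2, feat_dim=3, aux_names=["aux_a", "aux_b"])
for _ in range(50):
    for x, t in data:
        net.train_step(x, t)

_, before = net.forward([1.0, 1.0])
net.strip_auxiliary_heads()
_, after = net.forward([1.0, 1.0])
```

In this sketch the auxiliary heads influence the backbone only through the training loss, so removing them afterwards leaves the main prediction unchanged while skipping their computation at inference.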
Pages: 2424-2436
Number of pages: 13
Related papers
50 in total
  • [41] Monocular 3D object detection for construction scene analysis
    Shen, Jie
    Jiao, Lang
    Zhang, Cong
    Peng, Keran
    COMPUTER-AIDED CIVIL AND INFRASTRUCTURE ENGINEERING, 2024, 39 (09) : 1370 - 1389
  • [42] Delving into Localization Errors for Monocular 3D Object Detection
    Ma, Xinzhu
    Zhang, Yinmin
    Xu, Dan
    Zhou, Dongzhan
    Yi, Shuai
    Li, Haojie
    Ouyang, Wanli
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 4719 - 4728
  • [43] Shape-Aware Monocular 3D Object Detection
    Chen, Wei
    Zhao, Jie
    Zhao, Wan-Lei
    Wu, Song-Yuan
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (06) : 6416 - 6424
  • [44] Competition for roadside camera monocular 3D object detection
    Jinrang Jia
    Yifeng Shi
    Yuli Qu
    Rui Wang
    Xing Xu
    Hai Zhang
    NATIONAL SCIENCE REVIEW, 2023, 10 (06) : 34 - 37
  • [45] MonoGRNet: A General Framework for Monocular 3D Object Detection
    Qin, Zengyi
    Wang, Jinglu
    Lu, Yan
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (09) : 5170 - 5184
  • [46] Monocular 3D Tracking of Deformable Surfaces
    Puig, Luis
    Daniilidis, Kostas
    2016 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2016, : 580 - 586
  • [47] Object-Aware Centroid Voting for Monocular 3D Object Detection
    Bao, Wentao
    Yu, Qi
    Kong, Yu
    2020 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2020, : 2197 - 2204
  • [48] Virtual Sparse Convolution for Multimodal 3D Object Detection
    Wu, Hai
    Wen, Chenglu
    Shi, Shaoshuai
    Li, Xin
    Wang, Cheng
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 21653 - 21662
  • [49] Deep Learning-Based Monocular 3D Object Detection with Refinement of Depth Information
    Hu, Henan
    Zhu, Ming
    Li, Muyu
    Chan, Kwok-Leung
    SENSORS, 2022, 22 (07)
  • [50] Selective Transfer Learning of Cross-Modality Distillation for Monocular 3D Object Detection
    Ding, Rui
    Yang, Meng
    Zheng, Nanning
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (10) : 9925 - 9938