Monocular three-dimensional object detection using data augmentation and self-supervised learning in autonomous driving

Cited by: 1
Authors
Thayalan, Sugirtha [1]
Muthukumarasamy, Sridevi [1]
Santhakumar, Khailash [2]
Ravi, Kiran Bangalore [3]
Liu, Hao [3]
Gauthier, Thomas [3]
Yogamani, Senthil [4]
Affiliations
[1] Natl Inst Technol, Dept Comp Sci & Engn, Trichy, Tamil Nadu, India
[2] SASTRA Univ, Thanjavur, Tamilnadu, India
[3] Navya, Paris, France
[4] Valeo Vis Syst, Comp Vis Platform, Tuam, Ireland
Keywords
monocular three-dimensional detection; data augmentation; self-supervised learning
DOI
10.1117/1.JEI.32.1.011004
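Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract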
Monocular three-dimensional (3D) object detection (OD) is an essential and challenging task in autonomous driving. Modern convolutional neural network-based architectures for OD rely heavily on data augmentation (DA) and self-supervised learning (SSL). However, both have been explored relatively little for monocular 3D OD, especially in autonomous driving. DA techniques for two-dimensional OD do not extend directly to 3D objects. The literature shows that doing so requires adapting the 3D geometry of the input scene and synthesizing new viewpoints, which in turn requires accurate depth information for the scene that may not always be available. We propose augmentations for monocular 3D OD that do not require view synthesis. The proposed method combines DA with an SSL approach that uses multiobject labeling as the pretext task. We evaluate the proposed DA-SSL approach on the RTM3D detection model (baseline), with and without DA. The results show improvements of 2% to 3% in mAP 3D and 0.9% to 1.5% in bird's-eye view (BEV) scores using SSL over the baseline. We also propose an inverse class frequency weighted (ICFW) mAP score that highlights detection improvements for low-frequency classes in class-imbalanced datasets with long tails. We observe improvements in both ICFW mAP 3D and BEV scores, which account for the class imbalance in the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) validation dataset. We achieve a 4% to 5% increase in ICFW metrics with the pretext task.
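The abstract names an inverse class frequency weighted (ICFW) mAP but does not give its formula. The following is a minimal sketch under one plausible reading, not the authors' definition: each class's average precision (AP) is weighted in proportion to the reciprocal of its instance count, with the weights normalized to sum to 1. The function name `icfw_map`, the class names, and all numbers are hypothetical and chosen only for illustration.

```python
def icfw_map(ap_per_class, count_per_class):
    """Weighted mean of per-class AP, with weights proportional to 1 / class count.

    Assumption: this normalization is an illustrative guess; the abstract
    does not specify how the ICFW score is actually computed.
    """
    if set(ap_per_class) != set(count_per_class):
        raise ValueError("AP and count dictionaries must cover the same classes")
    if any(n <= 0 for n in count_per_class.values()):
        raise ValueError("every class needs a positive instance count")

    # Inverse-frequency weights: rare classes get larger weights.
    inverse = {c: 1.0 / n for c, n in count_per_class.items()}
    total = sum(inverse.values())

    # Normalize the weights to sum to 1 and take the weighted mean of the APs.
    return sum(ap_per_class[c] * inverse[c] / total for c in ap_per_class)


# Hypothetical per-class APs and instance counts (not results from the paper).
ap = {"Car": 0.8, "Pedestrian": 0.4, "Cyclist": 0.2}
counts = {"Car": 100, "Pedestrian": 20, "Cyclist": 10}

plain_map = sum(ap.values()) / len(ap)
weighted_map = icfw_map(ap, counts)
print(f"plain mAP = {plain_map:.4f}, ICFW mAP = {weighted_map:.4f}")
```

With these made-up numbers, the weighted score sits below the plain mean because the rare classes have the lowest AP. Under such a weighting, a gain on a rare class moves the score more than an equal gain on a frequent class, which matches the abstract's stated aim of highlighting improvements for low-frequency classes.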
Pages: 19