Monocular three-dimensional object detection using data augmentation and self-supervised learning in autonomous driving

Cited by: 1
Authors
Thayalan, Sugirtha [1]
Muthukumarasamy, Sridevi [1]
Santhakumar, Khailash [2]
Ravi, Kiran Bangalore [3]
Liu, Hao [3]
Gauthier, Thomas [3]
Yogamani, Senthil [4]
Affiliations
[1] Natl Inst Technol, Dept Comp Sci & Engn, Trichy, Tamil Nadu, India
[2] SASTRA Univ, Thanjavur, Tamilnadu, India
[3] Navya, Paris, France
[4] Valeo Vis Syst, Comp Vis Platform, Tuam, Ireland
Keywords
monocular three-dimensional detection; data augmentation; self-supervised learning
DOI
10.1117/1.JEI.32.1.011004
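Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809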
Abstract
Monocular three-dimensional (3D) object detection (OD) is an essential and challenging task in autonomous driving. Modern convolutional neural network-based architectures for OD rely heavily on data augmentation (DA) and self-supervised learning (SSL). However, these techniques have been relatively little explored for monocular 3D OD, especially in autonomous driving. DAs designed for two-dimensional OD do not extend directly to 3D objects. The literature shows that extending them requires adapting the 3D geometry of the input scene and synthesizing new viewpoints, which in turn requires accurate depth information of the scene that may not always be available. We propose augmentations for monocular 3D OD that do not require view synthesis. The proposed method combines DA with an SSL approach that uses multiobject labeling as the pretext task. We evaluate the proposed DA-SSL approach on the RTM3D detection model (baseline), with and without the application of DA. The results demonstrate improvements of 2% to 3% in mAP 3D and 0.9% to 1.5% in bird's-eye view (BEV) scores using SSL over the baseline scores. We propose an inverse class frequency weighted (ICFW) mAP score that highlights improvements in detection for low-frequency classes in class-imbalanced datasets with long tails. We observe improvements in both ICFW mAP 3D and BEV scores, which take into account the class imbalance in the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) validation dataset. We achieve a 4% to 5% increase in ICFW metrics with the pretext task.
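The abstract names an inverse class frequency weighted (ICFW) mAP but does not give its formula. The sketch below shows one plausible reading, assumed here rather than taken from the paper: each class's AP is weighted by the reciprocal of its instance count, and the weights are normalized to sum to 1. The function name `icfw_map`, the class names, the AP values, and the counts are all hypothetical and chosen only for illustration.

```python
from typing import Dict


def icfw_map(ap_per_class: Dict[str, float], class_counts: Dict[str, int]) -> float:
    """Weighted mean of per-class AP with weights proportional to 1 / class count.

    Assumed form of the ICFW score; the paper's exact definition may differ.
    """
    if set(ap_per_class) != set(class_counts):
        raise ValueError("AP and count dictionaries must cover the same classes")
    if any(n <= 0 for n in class_counts.values()):
        raise ValueError("every class needs a positive instance count")

    # Inverse frequencies, normalized so that the weights sum to 1.
    inverse = {c: 1.0 / n for c, n in class_counts.items()}
    total = sum(inverse.values())
    return sum(ap_per_class[c] * inverse[c] / total for c in ap_per_class)


# Hypothetical per-class AP values and instance counts (not from the paper).
ap = {"Car": 0.60, "Pedestrian": 0.30, "Cyclist": 0.20}
counts = {"Car": 8000, "Pedestrian": 2000, "Cyclist": 1000}

plain_map = sum(ap.values()) / len(ap)  # unweighted mean over classes
weighted = icfw_map(ap, counts)         # rare classes dominate the score
```

Under this reading, a gain on a rare class such as the hypothetical Cyclist moves the score far more than the same gain on the frequent Car class, which matches the stated aim of highlighting low-frequency classes.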
Pages: 19