Monocular three-dimensional object detection using data augmentation and self-supervised learning in autonomous driving

Cited by: 1

Authors:

Thayalan, Sugirtha [1]

Muthukumarasamy, Sridevi [1]

Santhakumar, Khailash [2]

Ravi, Kiran Bangalore [3]

Liu, Hao [3]

Gauthier, Thomas [3]

Yogamani, Senthil [4]

Affiliations:

[1] Natl Inst Technol, Dept Comp Sci & Engn, Trichy, Tamil Nadu, India

[2] SASTRA Univ, Thanjavur, Tamilnadu, India

[3] Navya, Paris, France

[4] Valeo Vis Syst, Comp Vis Platform, Tuam, Ireland

Source:

JOURNAL OF ELECTRONIC IMAGING | 2023 / Vol. 32 / Issue 01

Keywords:

monocular three-dimensional detection; data augmentation; self-supervised learning;

DOI:

10.1117/1.JEI.32.1.011004

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Subject classification codes:

0808 ; 0809 ;

Abstract:

Monocular three-dimensional (3D) object detection (OD) is an essential and challenging task in the domain of autonomous driving. Modern convolutional neural network-based architectures for OD rely heavily on data augmentation (DA) and self-supervised learning (SSL). However, these techniques have been relatively less explored for monocular 3D OD, especially in the field of autonomous driving. DAs designed for two-dimensional OD do not directly extend to 3D objects. The literature shows that this extension requires adapting the 3D geometry of the input scene and synthesizing new viewpoints, which in turn requires accurate depth information of the scene that may not always be available. We propose augmentations for monocular 3D OD without view synthesis. The proposed method combines DA with an SSL approach that uses multiobject labeling as the pretext task. We evaluate the proposed DA-SSL approach on the RTM3D detection model (baseline), with and without the application of DA. The results demonstrate improvements of 2% to 3% in mAP 3D and 0.9% to 1.5% in bird's eye view (BEV) scores using SSL over the baseline scores. We propose an inverse class frequency weighted (ICFW) mAP score that highlights improvements in detection for low-frequency classes in class-imbalanced datasets with long tails. We observe improvements in both ICFW mAP 3D and BEV scores, which take into account the class imbalance in the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) validation dataset. We achieve a 4% to 5% increase in ICFW metrics with the pretext task.


Pages: 19
