DeepSTEP - Deep Learning-Based Spatio-Temporal End-To-End Perception for Autonomous Vehicles

Cited by: 1
Authors
Huch, Sebastian [1 ]
Sauerbeck, Florian [1 ]
Betz, Johannes [2 ]
Affiliations
[1] Tech Univ Munich, Sch Engn & Design, Inst Automot Technol, D-85748 Garching, Germany
[2] Tech Univ Munich, Professorship Autonomous Vehicle Syst, D-85748 Garching, Germany
Source
2023 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2023
DOI
10.1109/IV55152.2023.10186768
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Autonomous vehicles demand high accuracy and robustness from their perception algorithms. To develop efficient and scalable perception algorithms, as much information as possible should be extracted from the available sensor data. In this work, we present our concept for an end-to-end perception architecture, named DeepSTEP. The deep learning-based architecture processes raw sensor data from camera, LiDAR, and RaDAR, and combines the extracted features in a deep fusion network. The output of this deep fusion network is a shared feature space, which is used by perception head networks to fulfill several perception tasks, such as object detection or local mapping. DeepSTEP incorporates multiple ideas to advance the state of the art: First, combining detection and localization into a single pipeline allows for efficient processing, reduces computational overhead, and further improves overall performance. Second, the architecture leverages the temporal domain by using a self-attention mechanism that focuses on the most important features. We believe that our concept of DeepSTEP will advance the development of end-to-end perception systems. The network will be deployed on our research vehicle, which will be used as a platform for data collection, real-world testing, and validation. In conclusion, DeepSTEP represents a significant advancement in the field of perception for autonomous vehicles. The architecture's end-to-end design, time-aware attention mechanism, and integration of multiple perception tasks make it a promising solution for real-world deployment. This research is a work in progress and presents the first concept of establishing a novel perception pipeline.
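The data flow the abstract describes — per-sensor features fused into a shared feature space, a temporal self-attention step, and multiple task heads reading the same features — can be sketched in a few lines of NumPy. This is a minimal illustrative sketch, not the paper's implementation: all dimensions, weight matrices, and head names below are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_self_attention(frames, W_q, W_k, W_v):
    """Single-head self-attention over the temporal axis.
    frames: (T, d) -- one fused feature vector per time step."""
    Q, K, V = frames @ W_q, frames @ W_k, frames @ W_v
    weights = softmax(Q @ K.T / np.sqrt(K.shape[-1]), axis=-1)
    return weights @ V  # (T, d) time-attended features

d, T = 16, 4  # illustrative feature dimension and temporal window

# Hypothetical per-sensor features, already projected to d dims each.
camera = rng.standard_normal((T, d))
lidar = rng.standard_normal((T, d))
radar = rng.standard_normal((T, d))

# Deep-fusion stand-in: concatenate sensors, project to the shared space.
W_fuse = rng.standard_normal((3 * d, d)) / np.sqrt(3 * d)
fused = np.concatenate([camera, lidar, radar], axis=-1) @ W_fuse  # (T, d)

# Self-attention over the fused temporal sequence.
W_q, W_k, W_v = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
shared = temporal_self_attention(fused, W_q, W_k, W_v)

# Two lightweight heads read the same shared feature space.
W_det = rng.standard_normal((d, 7))   # e.g. a 3D box parameterization
W_map = rng.standard_normal((d, 32))  # e.g. a local-map embedding
detections = shared[-1] @ W_det       # latest time step drives the heads
local_map = shared[-1] @ W_map
```

The key structural point is that both heads consume `shared` rather than re-running sensor processing per task, which is the efficiency argument the abstract makes for the single-pipeline design.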
Pages: 8