Autonomous driving enhanced: a fusion framework integrating LiDAR point clouds with monovision depth-aware transformers for robust object detection

Cited by: 0

Authors:

Liu, Hui [1]

Su, Tong [2]

Guo, Jing [1]

Affiliations:

[1] Suzhou Vocat Inst Ind Technol, Sch Automot Engn, Suzhou 215104, Peoples R China

[2] Shanghai Lixin Univ Accounting & Finance, Sch Informat Management, Shanghai 201209, Peoples R China

Source:

ENGINEERING RESEARCH EXPRESS | 2025 / Vol. 7 / Issue 01

Keywords:

autonomous driving perception; depth-aware transformer (DAT); LiDAR point clouds; monocular depth estimation; adaptive fusion strategy; dilated convolution;

DOI:

10.1088/2631-8695/ada7c7

Chinese Library Classification (CLC):

T [Industrial Technology];

Discipline classification code:

08;

Abstract:

In the evolving landscape of autonomous driving technology, the ability to accurately detect and localize objects in complex environments is paramount. This paper introduces an innovative object detection algorithm designed to enhance the perception capabilities of autonomous vehicles. We propose a novel fusion framework that integrates LiDAR point clouds with monocular depth estimations, utilizing a Depth-Aware Transformer (DAT) architecture. The DAT, a recent advancement in transformer models, is uniquely equipped to handle spatial hierarchies and depth cues, making it ideal for interpreting three-dimensional scenes from two-dimensional images. Our approach leverages the complementary strengths of LiDAR and monocular vision, where LiDAR provides precise depth information while the monocular camera offers rich visual textures and color information. The adaptive fusion strategy dynamically adjusts the weight given to each sensor modality based on the reliability and quality of the data in real-time, ensuring optimal performance under varying environmental conditions. We validate our method using the extensive KITTI dataset, a benchmark in autonomous driving research. Extensive experiments demonstrate that our algorithm outperforms state-of-the-art object detection models, achieving higher accuracy in object localization and classification. Moreover, our solution showcases improved robustness and generalization across diverse driving environments, thanks to the enhanced depth perception enabled by the DAT architecture. To further validate the effectiveness of our model, we conducted both comparative and ablation experiments, which confirmed the performance improvements of our approach and demonstrated the critical contributions of the DAT and Adaptive Fusion components. The proposed fusion of LiDAR and monocular depth estimation using Depth-Aware Transformers represents a significant step forward in autonomous driving perception systems. 
It not only advances the field of object detection but also paves the way for more sophisticated applications in autonomous navigation, where a deep understanding of the environment is crucial for safe and efficient operation.
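
The abstract describes an adaptive fusion strategy that reweights each sensor modality according to data reliability, but it does not give the formulation. As an illustration only, the idea can be sketched as a per-pixel softmax blend of two depth maps. The function name `adaptive_fuse`, the confidence inputs, and the `temperature` parameter are assumptions made for this sketch and are not taken from the paper.

```python
import numpy as np


def adaptive_fuse(lidar_depth, mono_depth, lidar_conf, mono_conf, temperature=1.0):
    """Blend two depth maps per pixel using softmax weights over confidence scores.

    Hypothetical sketch: the paper's actual fusion rule is not given in the abstract.
    All inputs are arrays of the same shape; higher confidence means more trust.
    """
    # Stack the two confidence maps so axis 0 indexes the modality.
    scores = np.stack([lidar_conf, mono_conf]) / temperature
    # Subtract the per-pixel maximum for numerical stability before exponentiating.
    scores = scores - scores.max(axis=0, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=0, keepdims=True)
    # Convex combination of the two depth estimates at every pixel.
    fused = weights[0] * lidar_depth + weights[1] * mono_depth
    return fused, weights


# Usage: the first pixel trusts LiDAR more, the second trusts both equally.
lidar_depth = np.array([[10.0, 20.0]])
mono_depth = np.array([[12.0, 18.0]])
lidar_conf = np.array([[2.0, 0.0]])
mono_conf = np.array([[0.0, 0.0]])
fused, weights = adaptive_fuse(lidar_depth, mono_depth, lidar_conf, mono_conf)
```

With equal confidences the result is the plain average of the two estimates; as one confidence grows, the fused value moves toward that modality. A learned variant would presumably predict the confidence maps from the features themselves, which this sketch leaves out.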


Pages: 16
