Autonomous driving enhanced: a fusion framework integrating LiDAR point clouds with monovision depth-aware transformers for robust object detection

Cited by: 0
Authors
Liu, Hui [1 ]
Su, Tong [2 ]
Guo, Jing [1 ]
Affiliations
[1] Suzhou Vocat Inst Ind Technol, Sch Automot Engn, Suzhou 215104, Peoples R China
[2] Shanghai Lixin Univ Accounting & Finance, Sch Informat Management, Shanghai 201209, Peoples R China
Source
ENGINEERING RESEARCH EXPRESS | 2025, Vol. 7, Issue 1
Keywords
autonomous driving perception; depth-aware transformer (DAT); LiDAR point clouds; monocular depth estimation; adaptive fusion strategy; dilated convolution
DOI
10.1088/2631-8695/ada7c7
CLC Number (Chinese Library Classification)
T [Industrial Technology]
Discipline Code
08
Abstract
In the evolving landscape of autonomous driving technology, the ability to accurately detect and localize objects in complex environments is paramount. This paper introduces an object detection algorithm designed to enhance the perception capabilities of autonomous vehicles. We propose a novel fusion framework that integrates LiDAR point clouds with monocular depth estimates, built on a Depth-Aware Transformer (DAT) architecture. The DAT, a recent advance in transformer models, is equipped to handle spatial hierarchies and depth cues, making it well suited to interpreting three-dimensional scenes from two-dimensional images. Our approach leverages the complementary strengths of LiDAR and monocular vision: LiDAR provides precise depth information, while the monocular camera offers rich visual texture and color. An adaptive fusion strategy dynamically adjusts the weight given to each sensor modality in real time, based on the reliability and quality of the incoming data, ensuring stable performance under varying environmental conditions. We validate our method on the KITTI dataset, a standard benchmark in autonomous driving research. Extensive experiments show that our algorithm outperforms state-of-the-art object detection models, achieving higher accuracy in object localization and classification. Our solution also shows improved robustness and generalization across diverse driving environments, owing to the enhanced depth perception enabled by the DAT architecture. To further validate the model, we conducted comparative and ablation experiments, which confirmed the performance gains of our approach and demonstrated the critical contributions of the DAT and adaptive fusion components. The proposed fusion of LiDAR and monocular depth estimation using Depth-Aware Transformers represents a significant step forward for autonomous driving perception systems. It not only advances object detection but also paves the way for more sophisticated applications in autonomous navigation, where a deep understanding of the environment is crucial for safe and efficient operation.
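The abstract names two mechanisms without implementation detail: depth-aware attention over image features, and adaptive, reliability-weighted fusion of the two sensor streams. The following PyTorch sketch is a hypothetical illustration of both ideas, not the authors' published implementation; the class names (DepthAwareAttention, AdaptiveFusionGate), the depth-gap attention bias, and the 1x1-conv reliability heads are all assumptions made for illustration.

# Minimal, hypothetical sketch of the two mechanisms the abstract names.
# All names and design choices here are illustrative assumptions; the
# paper's actual architecture is not reproduced in this record.
import torch
import torch.nn as nn


class DepthAwareAttention(nn.Module):
    """Self-attention biased by pairwise depth similarity.

    Tokens at similar depths attend to each other more strongly: one
    plausible reading of "depth-aware" attention over image features.
    """

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.depth_scale = nn.Parameter(torch.tensor(1.0))

    def forward(self, tokens: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        # tokens: (B, N, C); depth: (B, N) per-token monocular depth estimate.
        # Additive bias penalizing attention across large depth gaps.
        bias = -self.depth_scale * (depth.unsqueeze(2) - depth.unsqueeze(1)).abs()
        # MultiheadAttention expects a (B * num_heads, N, N) float mask.
        mask = bias.repeat_interleave(self.attn.num_heads, dim=0)
        out, _ = self.attn(tokens, tokens, tokens, attn_mask=mask)
        return out


class AdaptiveFusionGate(nn.Module):
    """Fuses LiDAR and camera feature maps with learned per-location weights.

    A 1x1-conv reliability head scores each modality; a softmax over the
    two scores yields weights that sum to 1 at every spatial location, so
    the gate can lean on whichever sensor looks more reliable there.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.lidar_score = nn.Conv2d(channels, 1, kernel_size=1)
        self.camera_score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, lidar_feat: torch.Tensor, cam_feat: torch.Tensor) -> torch.Tensor:
        logits = torch.cat(
            [self.lidar_score(lidar_feat), self.camera_score(cam_feat)], dim=1
        )  # (B, 2, H, W)
        weights = torch.softmax(logits, dim=1)
        return weights[:, 0:1] * lidar_feat + weights[:, 1:2] * cam_feat


if __name__ == "__main__":
    # Toy shapes only; real inputs would be projected point-cloud features
    # and depth-augmented image features from the two backbones.
    gate = AdaptiveFusionGate(channels=64)
    fused = gate(torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32))
    attn = DepthAwareAttention(dim=64)
    tokens = fused.flatten(2).transpose(1, 2)   # (B, H*W, C)
    depth = torch.rand(2, tokens.shape[1])      # per-token depth in [0, 1)
    print(attn(tokens, depth).shape)            # torch.Size([2, 1024, 64])

In this reading, the softmax over the two reliability logits forces the modality weights to sum to one everywhere, so degraded input from one sensor (sparse LiDAR returns, low-light imagery) simply shifts weight toward the other, matching the abstract's claim of dynamic per-modality weighting.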
Pages: 16
Related Papers
43 records in total
  • [21] Point-Level Fusion and Channel Attention for 3D Object Detection in Autonomous Driving
    Shen, Juntao
    Fang, Zheng
    Huang, Jin
    SENSORS, 2025, 25 (04)
  • [22] Sparse Sensor Fusion for 3D Object Detection with Symmetry-Aware Colored Point Clouds
    Wang, Lele
    Zhang, Peng
    Li, Ming
    Zhang, Faming
SYMMETRY-BASEL, 2024, 16 (12)
  • [23] CrossFusion net: Deep 3D object detection based on RGB images and point clouds in autonomous driving
    Hong, Dza-Shiang
    Chen, Hung-Hao
    Hsiao, Pei-Yung
    Fu, Li-Chen
    Siao, Siang-Min
    IMAGE AND VISION COMPUTING, 2020, 100
  • [24] End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds
    Zhou, Yin
    Sun, Pei
    Zhang, Yu
    Anguelov, Dragomir
    Gao, Jiyang
    Ouyang, Tom
    Guo, James
    Ngiam, Jiquan
    Vasudevan, Vijay
CONFERENCE ON ROBOT LEARNING, VOL 100, 2019
  • [25] Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving
    Wang, Yan
    Chao, Wei-Lun
    Garg, Divyansh
    Hariharan, Bharath
    Campbell, Mark
    Weinberger, Kilian Q.
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 8437-8445
  • [26] FuseMODNet: Real-Time Camera and LiDAR based Moving Object Detection for robust low-light Autonomous Driving
    Rashed, Hazem
    Ramzy, Mohamed
    Vaquero, Victor
    El Sallab, Ahmad
    Sistu, Ganesh
    Yogamani, Senthil
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019: 2393-2402
  • [27] FS-Net: LiDAR-Camera Fusion With Matched Scale for 3D Object Detection in Autonomous Driving
    Zhang, Lei
    Li, Xu
    Tang, Kaichen
    Jiang, Yunzhe
    Yang, Liu
    Zhang, Yonggang
    Chen, Xianyi
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (11): 12154-12165
  • [28] PEPillar: a point-enhanced pillar network for efficient 3D object detection in autonomous driving
    Sun, Libo
    Li, Yifan
    Qin, Wenhu
VISUAL COMPUTER, 2025, 41 (03): 1777-1788
  • [29] A New Density-Based Clustering Method Considering Spatial Distribution of Lidar Point Cloud for Object Detection of Autonomous Driving
    Li, Caihong
    Gao, Feng
    Han, Xiangyu
    Zhang, Bowen
    ELECTRONICS, 2021, 10 (16)
  • [30] Advancing autonomous SLAM systems: Integrating YOLO object detection and enhanced loop closure techniques for robust environment mapping
    Ul Islam, Qamar
    Khozaei, Fatemeh
    Barhoumi, El Manaa Salah Al
    Baig, Imran
    Ignatyev, Dmitry
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2025, 185