Autonomous driving enhanced: a fusion framework integrating LiDAR point clouds with monovision depth-aware transformers for robust object detection

Cited: 0
Authors
Liu, Hui [1 ]
Su, Tong [2 ]
Guo, Jing [1 ]
Affiliations
[1] Suzhou Vocat Inst Ind Technol, Sch Automot Engn, Suzhou 215104, Peoples R China
[2] Shanghai Lixin Univ Accounting & Finance, Sch Informat Management, Shanghai 201209, Peoples R China
Source
ENGINEERING RESEARCH EXPRESS | 2025, Vol. 7, Issue 01
Keywords
autonomous driving perception; depth-aware transformer (DAT); LiDAR point clouds; monocular depth estimation; adaptive fusion strategy; dilated convolution;
DOI
10.1088/2631-8695/ada7c7
Chinese Library Classification (CLC)
T [Industrial Technology];
Discipline Code
08;
Abstract
In the evolving landscape of autonomous driving technology, the ability to accurately detect and localize objects in complex environments is paramount. This paper introduces an object detection algorithm designed to enhance the perception capabilities of autonomous vehicles. We propose a novel fusion framework that integrates LiDAR point clouds with monocular depth estimates using a Depth-Aware Transformer (DAT) architecture. The DAT, a recent advance in transformer models, is equipped to handle spatial hierarchies and depth cues, making it well suited to interpreting three-dimensional scenes from two-dimensional images. Our approach leverages the complementary strengths of the two sensors: LiDAR provides precise depth information, while the monocular camera offers rich visual texture and color. An adaptive fusion strategy dynamically adjusts the weight given to each sensor modality based on the reliability and quality of its data in real time, ensuring consistent performance under varying environmental conditions. We validate our method on the KITTI dataset, a standard benchmark in autonomous driving research. Extensive experiments demonstrate that our algorithm outperforms state-of-the-art object detection models, achieving higher accuracy in object localization and classification. Our solution also exhibits improved robustness and generalization across diverse driving environments, owing to the enhanced depth perception enabled by the DAT architecture. Comparative and ablation experiments further confirm these performance gains and demonstrate the individual contributions of the DAT and adaptive fusion components. The proposed fusion of LiDAR and monocular depth estimation using Depth-Aware Transformers represents a significant step forward for autonomous driving perception systems: it advances object detection and paves the way for more sophisticated autonomous navigation applications, where a deep understanding of the environment is crucial for safe and efficient operation.
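The abstract describes, but does not specify, the adaptive fusion strategy. Below is a minimal PyTorch sketch of how such reliability-weighted fusion could look, assuming LiDAR and camera features have already been projected into a shared bird's-eye-view grid; the module and variable names (AdaptiveFusion, gate, context) are hypothetical illustrations, not taken from the paper. The dilated convolution echoes the paper's "dilated convolution" keyword.

# Hypothetical sketch of reliability-weighted sensor fusion (not the authors' code).
# A small gating network predicts per-location weights for the LiDAR and camera
# feature maps and blends them; weights sum to 1 via a softmax.
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Dilated 3x3 convolution enlarges the receptive field without
        # additional downsampling (padding=2 keeps the spatial size).
        self.context = nn.Conv2d(2 * channels, channels, kernel_size=3,
                                 padding=2, dilation=2)
        # The gate emits two reliability weights (LiDAR vs. camera) per location.
        self.gate = nn.Sequential(
            nn.Conv2d(channels, 2, kernel_size=1),
            nn.Softmax(dim=1),
        )

    def forward(self, lidar_feat: torch.Tensor, cam_feat: torch.Tensor) -> torch.Tensor:
        # lidar_feat, cam_feat: (B, C, H, W) feature maps in a shared view.
        ctx = torch.relu(self.context(torch.cat([lidar_feat, cam_feat], dim=1)))
        w = self.gate(ctx)  # (B, 2, H, W); w[:, 0] + w[:, 1] == 1 everywhere
        return w[:, 0:1] * lidar_feat + w[:, 1:2] * cam_feat

if __name__ == "__main__":
    fuse = AdaptiveFusion(channels=64)
    lidar = torch.randn(1, 64, 128, 128)  # e.g. voxelized LiDAR BEV features
    cam = torch.randn(1, 64, 128, 128)    # e.g. depth-aware image features
    print(fuse(lidar, cam).shape)         # torch.Size([1, 64, 128, 128])

In a full pipeline, a module like this would sit between the two feature extractors and the detection head; the softmax gate lets the network down-weight whichever modality is locally less reliable (for example, camera features at night or sparse LiDAR returns at long range).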
Pages: 16
Related Papers
43 records in total; items [31]-[40] shown below
  • [31] Kim, Jongho; Yi, Kyongsu. Lidar Object Perception Framework for Urban Autonomous Driving: Detection and State Tracking Based on Convolutional Gated Recurrent Unit and Statistical Approach. IEEE VEHICULAR TECHNOLOGY MAGAZINE, 2023, 18(02): 60-68.
  • [32] He, Zhijian; Fan, Xueli; Peng, Yun; Shen, Zhaoyan; Jiao, Jianhao; Liu, Ming. EmPointMovSeg: Sparse Tensor-Based Moving-Object Segmentation in 3-D LiDAR Point Clouds for Autonomous Driving-Embedded System. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2023, 42(01): 41-53.
  • [33] Huang, Yuanxian; Zhou, Jian; Li, Xicheng; Dong, Zhen; Xiao, Jinsheng; Wang, Shurui; Zhang, Hongjuan. MENet: Map-enhanced 3D object detection in bird's-eye view for LiDAR point clouds. INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2023, 120.
  • [34] Deng, Jianning; Chan, Gabriel; Zhong, Hantao; Lu, Chris Xiaoxuan. Robust 3D Object Detection from LiDAR-Radar Point Clouds via Cross-Modal Feature Augmentation. 2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2024), 2024: 6585-6591.
  • [35] Zhang, Yan; Liu, Kang; Bao, Hong; Zheng, Ying; Yang, Yi. PMPF: Point-Cloud Multiple-Pixel Fusion-Based 3D Object Detection for Autonomous Driving. REMOTE SENSING, 2023, 15(06).
  • [36] Yu, Chuanbo; Lei, Jianjun; Peng, Bo; Shen, Haifeng; Huang, Qingming. SIEV-Net: A Structure-Information Enhanced Voxel Network for 3D Object Detection From LiDAR Point Clouds. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60.
  • [37] Wang, Li; Zhang, Xinyu; Li, Jun; Xv, Baowei; Fu, Rong; Chen, Haifeng; Yang, Lei; Jin, Dafeng; Zhao, Lijun. Multi-Modal and Multi-Scale Fusion 3D Object Detection of 4D Radar and LiDAR for Autonomous Driving. IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023, 72(05): 5628-5641.
  • [38] You, Jihwan; Kim, Young-Keun. Up-Sampling Method for Low-Resolution LiDAR Point Cloud to Enhance 3D Object Detection in an Autonomous Driving Environment. SENSORS, 2023, 23(01).
  • [39] Mohapatra, Sambit; Yogamani, Senthil; Gotzig, Heinrich; Milz, Stefan; Maeder, Patrick. BEVDetNet: Bird's Eye View LiDAR Point Cloud based Real-time 3D Object Detection for Autonomous Driving. 2021 IEEE INTELLIGENT TRANSPORTATION SYSTEMS CONFERENCE (ITSC), 2021: 2809-2815.
  • [40] Zhang, Zhitian; Zhao, Hongdong; Zhao, Yazhou; Chen, Dan; Zhang, Ke; Li, Yanqi. BRTPillar: boosting real-time 3D object detection based point cloud and RGB image fusion in autonomous driving. INTERNATIONAL JOURNAL OF INTELLIGENT COMPUTING AND CYBERNETICS, 2025, 18(01): 217-235.