PLC-Fusion: Perspective-Based Hierarchical and Deep LiDAR Camera Fusion for 3D Object Detection in Autonomous Vehicles

Cited by: 1
Authors
Mushtaq, Husnain [1]
Deng, Xiaoheng [1]
Azhar, Fizza [2]
Ali, Mubashir [3]
Sherazi, Hafiz Husnain Raza [4]
Affiliations
[1] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Peoples R China
[2] Univ Chenab, Dept Comp Sci, Gujrat 50700, Pakistan
[3] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, England
[4] Newcastle Univ, Sch Comp, Newcastle Upon Tyne NE4 5TG, England
Funding
National Natural Science Foundation of China
Keywords
LiDAR-camera fusion; object perspective sampling; ViT feature fusion; 3D object detection; autonomous vehicles;
DOI
10.3390/info15110739
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Accurate 3D object detection is essential for autonomous driving, yet traditional LiDAR models often struggle with sparse point clouds. To address this, we propose PLC-Fusion, a perspective-aware, hierarchical, vision transformer-based LiDAR-camera fusion framework for 3D object detection. This efficient multi-modal framework integrates LiDAR and camera data for improved performance. First, our method enhances LiDAR data by projecting them onto a 2D plane, which enables the Object Perspective Sampling (OPS) module to extract object perspective features from a probability map. The method incorporates a lightweight perspective detector, consisting of interconnected 2D and monocular 3D sub-networks, to extract image features and generate object perspective proposals by predicting and refining the top-scored 3D candidates. Second, it leverages two independent transformers: CamViT for 2D image features and LidViT for 3D point cloud features. These ViT-based representations are fused via the Cross-Fusion module for hierarchical and deep representation learning, improving both performance and computational efficiency. Together, these mechanisms make better use of semantic features in a region of interest (ROI) to obtain more representative point features, leading to a more effective fusion of information from the LiDAR and camera sources. PLC-Fusion outperforms existing methods, achieving a mean average precision (mAP) of 83.52% for 3D detection and 90.37% for BEV detection, while maintaining a competitive inference time of 0.18 s. Our model addresses computational bottlenecks by eliminating the need for dense BEV searches and global attention mechanisms while improving detection range and precision.
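The abstract mentions projecting LiDAR data onto a 2D plane but does not say how. As a minimal illustration of the general idea only, the sketch below projects 3D points onto an image plane with a standard pinhole camera model. It assumes the points are already expressed in the camera frame; the function name `project_points` and the parameters `fx`, `fy`, `cx`, `cy`, `width`, and `height` are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]   # (x, y, z) in the camera frame, z pointing forward
Pixel = Tuple[float, float, float]     # (u, v, depth)


def project_points(
    points: List[Point3D],
    fx: float, fy: float,      # focal lengths in pixels (assumed intrinsics)
    cx: float, cy: float,      # principal point in pixels (assumed intrinsics)
    width: int, height: int,   # image size in pixels
) -> List[Pixel]:
    """Project camera-frame 3D points to pixels with a pinhole model.

    Points behind the camera or outside the image bounds are dropped.
    """
    pixels: List[Pixel] = []
    for x, y, z in points:
        if z <= 0.0:
            continue  # point lies behind (or on) the camera plane
        u = fx * x / z + cx
        v = fy * y / z + cy
        if 0.0 <= u < width and 0.0 <= v < height:
            pixels.append((u, v, z))
    return pixels


if __name__ == "__main__":
    cloud = [(0.0, 0.0, 10.0), (1.0, 0.0, 10.0), (0.0, 0.0, -5.0), (100.0, 0.0, 1.0)]
    print(project_points(cloud, 100.0, 100.0, 50.0, 50.0, 100, 100))
```

In a real pipeline the LiDAR points would first be transformed into the camera frame using the sensor calibration; that step is omitted here.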
Pages: 23
Related papers
50 in total
  • [21] LiDAR-camera fusion: Dual transformer enhancement for 3D object detection
    Chen, Mu
    Liu, Pengfei
    Zhao, Huaici
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 120
  • [22] A Frustum-based probabilistic framework for 3D object detection by fusion of LiDAR and camera data
    Gong, Zheng
    Lin, Haojia
    Zhang, Dedong
    Luo, Zhipeng
    Zelek, John
    Chen, Yiping
    Nurunnabi, Abdul
    Wang, Cheng
    Li, Jonathan
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2020, 159 : 90 - 100
  • [23] FS-Net: LiDAR-Camera Fusion With Matched Scale for 3D Object Detection in Autonomous Driving
    Zhang, Lei
    Li, Xu
    Tang, Kaichen
    Jiang, Yunzhe
    Yang, Liu
    Zhang, Yonggang
    Chen, Xianyi
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (11) : 12154 - 12165
  • [24] Multimodal Object Detection and Ranging Based on Camera and Lidar Sensor Fusion for Autonomous Driving
    Khan, Danish
    Baek, Minjin
    Kim, Min Young
    Han, Dong Seog
    2022 27TH ASIA PACIFIC CONFERENCE ON COMMUNICATIONS (APCC 2022): CREATING INNOVATIVE COMMUNICATION TECHNOLOGIES FOR POST-PANDEMIC ERA, 2022, : 342 - 343
  • [25] Fusion of an RGB camera and LiDAR sensor through a Graph CNN for 3D object detection
    Choi, Jinsol
    Shin, Minwoo
    Paik, Joonki
    OPTICS CONTINUUM, 2023, 2 (05) : 1166 - 1179
  • [26] FusionRCNN: LiDAR-Camera Fusion for Two-Stage 3D Object Detection
    Xu, Xinli
    Dong, Shaocong
    Xu, Tingfa
    Ding, Lihe
    Wang, Jie
    Jiang, Peng
    Song, Liqiang
    Li, Jianan
    REMOTE SENSING, 2023, 15 (07)
  • [27] FGFusion: Fine-Grained Lidar-Camera Fusion for 3D Object Detection
    Yin, Zixuan
    Sun, Han
    Liu, Ningzhong
    Zhou, Huiyu
    Shen, Jiaquan
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT III, 2024, 14427 : 505 - 517
  • [28] LiRaFusion: Deep Adaptive LiDAR-Radar Fusion for 3D Object Detection
    Song, Jingyu
    Zhao, Lingjun
    Skinner, Katherine A.
    2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2024), 2024, : 18250 - 18257
  • [29] Deep Learning-based 3D Object Detection Using LiDAR and Image Data Fusion
    Bharadhwaj, Bizzam Murali
    Nair, Binoy B.
    2022 IEEE 19TH INDIA COUNCIL INTERNATIONAL CONFERENCE, INDICON, 2022,
  • [30] Fast-CLOCs: Fast Camera-LiDAR Object Candidates Fusion for 3D Object Detection
    Pang, Su
    Morris, Daniel
    Radha, Hayder
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 3747 - 3756