Fully Sparse Fusion for 3D Object Detection

Cited by: 7
Authors
Li, Yingyan [1, 2]
Fan, Lue [1, 2]
Liu, Yang [1]
Huang, Zehao [3]
Chen, Yuntao [4]
Wang, Naiyan [3]
Zhang, Zhaoxiang [1, 2, 4]
Affiliations
[1] Chinese Acad Sci CASIA, Inst Automat, Ctr Research on Intelligent Percept & Comp CRIPAC, State Key Lab Multimodal Artificial Intelligence S, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci UCAS, Sch Future Technol, Beijing 100049, Peoples R China
[3] TuSimple, Beijing 100020, Peoples R China
[4] Chinese Acad Sci HKISICAS, Hong Kong Inst Sci & Innovat, Ctr Artificial Intelligence & Robot, Hong Kong, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords
Three-dimensional displays; Feature extraction; Laser radar; Cameras; Detectors; Instance segmentation; Point cloud compression; 3D object detection; multi-sensor fusion; fully sparse architecture; autonomous driving; long-range perception;
DOI
10.1109/TPAMI.2024.3392303
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Currently prevalent multi-modal 3D detection methods rely on dense detectors that usually use dense Bird's-Eye-View (BEV) feature maps. However, the cost of such BEV feature maps grows quadratically with the detection range, so they do not scale to long-range detection. Recently, the LiDAR-only fully sparse architecture has been gaining attention for its high efficiency in long-range perception. In this paper, we study how to develop a multi-modal fully sparse detector. Specifically, our proposed detector integrates the well-studied 2D instance segmentation into the LiDAR side, in parallel with the 3D instance segmentation part of the LiDAR-only baseline. The proposed instance-based fusion framework maintains full sparsity while overcoming the constraints associated with the LiDAR-only fully sparse detector. Our framework achieves state-of-the-art performance on the widely used nuScenes dataset, the Waymo Open Dataset, and the long-range Argoverse 2 dataset. Notably, the inference speed of our proposed method under the long-range perception setting is 2.7x faster than that of other state-of-the-art multi-modal 3D detection methods.
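The quadratic growth of dense BEV feature maps noted in the abstract can be illustrated with a rough calculation. The sketch below is not taken from the paper: the function name `dense_bev_cells`, the square grid centered on the ego vehicle, the 0.5 m cell size, and the example ranges are all assumptions chosen for illustration.

```python
def dense_bev_cells(detection_range_m: float, cell_size_m: float) -> int:
    """Count cells in a square BEV grid covering [-R, R] x [-R, R].

    Hypothetical helper for illustration only; real detectors may use
    non-square grids, different resolutions, or downsampled feature maps.
    """
    cells_per_side = round(2 * detection_range_m / cell_size_m)
    return cells_per_side * cells_per_side


if __name__ == "__main__":
    cell_size_m = 0.5  # assumed BEV cell size, not from the paper
    for range_m in (50, 100, 200):  # assumed detection ranges
        cells = dense_bev_cells(range_m, cell_size_m)
        print(f"range={range_m} m -> {cells} BEV cells")
```

Under these assumptions, doubling the detection range quadruples the number of cells a dense detector has to process, regardless of how many of those cells actually contain points.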
Pages: 7217-7231
Page count: 15