Fully Sparse Fusion for 3D Object Detection

Cited by: 7
Authors
Li, Yingyan [1 ,2 ]
Fan, Lue [1 ,2 ]
Liu, Yang [1 ]
Huang, Zehao [3 ]
Chen, Yuntao [4 ]
Wang, Naiyan [3 ]
Zhang, Zhaoxiang [1 ,2 ,4 ]
Affiliations
[1] Chinese Acad Sci CASIA, Inst Automat, Ctr Research on Intelligent Percept & Comp CRIPAC, State Key Lab Multimodal Artificial Intelligence S, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci UCAS, Sch Future Technol, Beijing 100049, Peoples R China
[3] TuSimple, Beijing 100020, Peoples R China
[4] Chinese Acad Sci HKISICAS, Hong Kong Inst Sci & Innovat, Ctr Artificial Intelligence & Robot, Hong Kong, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
Three-dimensional displays; Feature extraction; Laser radar; Cameras; Detectors; Instance segmentation; Point cloud compression; 3D object detection; multi-sensor fusion; fully sparse architecture; autonomous driving; long-range perception;
DOI
10.1109/TPAMI.2024.3392303
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Currently prevalent multi-modal 3D detection methods rely on dense detectors that usually use dense Bird's-Eye-View (BEV) feature maps. However, the cost of such BEV feature maps is quadratic in the detection range, which makes them unscalable for long-range detection. Recently, LiDAR-only fully sparse architectures have been gaining attention for their high efficiency in long-range perception. In this paper, we study how to develop a multi-modal fully sparse detector. Specifically, our proposed detector integrates the well-studied 2D instance segmentation into the LiDAR side, in parallel with the 3D instance segmentation part of the LiDAR-only baseline. The proposed instance-based fusion framework maintains full sparsity while overcoming the constraints associated with the LiDAR-only fully sparse detector. Our framework achieves state-of-the-art performance on the widely used nuScenes dataset, the Waymo Open Dataset, and the long-range Argoverse 2 dataset. Notably, the inference speed of our proposed method under the long-range perception setting is 2.7× faster than that of other state-of-the-art multi-modal 3D detection methods.
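The abstract's scaling claim can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes a square BEV grid centered on the sensor and a hypothetical cell size of 0.2 m, and the function name `dense_bev_cells` is invented for illustration. It only shows why the cell count of a dense map grows with the square of the detection range.

```python
def dense_bev_cells(detection_range_m: float, cell_size_m: float = 0.2) -> int:
    """Count cells in a square BEV grid spanning [-range, +range] on both axes.

    Both the square layout and the 0.2 m default cell size are illustrative
    assumptions, not values reported in the paper.
    """
    cells_per_side = round(2 * detection_range_m / cell_size_m)
    return cells_per_side * cells_per_side


if __name__ == "__main__":
    # Doubling the range doubles each side of the grid, so the cell count quadruples.
    for range_m in (50.0, 100.0, 200.0):
        print(f"range {range_m:5.0f} m -> {dense_bev_cells(range_m):,} cells")
```

Under these assumptions, each doubling of the range multiplies the number of dense cells by four, whereas a fully sparse detector only processes the locations that actually contain points.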
Pages: 7217-7231
Page count: 15
Related papers
50 in total
  • [1] Fully Sparse 3D Object Detection
    Fan, Lue
    Wang, Feng
    Wang, Naiyan
    Zhang, Zhaoxiang
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [2] Sparse Dense Fusion for 3D Object Detection
    Gao, Yulu
    Sima, Chonghao
    Shi, Shaoshuai
    Di, Shangzhe
    Liu, Si
    Li, Hongyang
    2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2023, : 10939 - 10946
  • [3] VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking
    Chen, Yukang
    Liu, Jianhui
    Zhang, Xiangyu
    Qi, Xiaojuan
    Jia, Jiaya
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 21674 - 21683
  • [4] SAFDNet: A Simple and Effective Network for Fully Sparse 3D Object Detection
    Zhang, Gang
    Chen, Junnan
    Gao, Guohuan
    Li, Jianmin
    Liu, Si
    Hu, Xiaolin
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 14477 - 14486
  • [5] SRFDet3D: Sparse Region Fusion based 3D Object Detection
    Erabati, Gopi Krishna
    Araujo, Helder
    NEUROCOMPUTING, 2024, 593
  • [6] Super Sparse 3D Object Detection
    Fan, Lue
    Yang, Yuxue
    Wang, Feng
    Wang, Naiyan
    Zhang, Zhaoxiang
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (10) : 12490 - 12505
  • [7] VPSNet: 3D object detection with voxel purification and fully sparse convolutional networks
    Wen, Jia
    Zhang, Qi
    Zhang, Guanghao
    JOURNAL OF SUPERCOMPUTING, 2025, 81 (03):
  • [8] STFNET: Sparse Temporal Fusion for 3D Object Detection in LiDAR Point Cloud
    Meng, Xin
    Zhou, Yuan
    Ma, Jun
    Jiang, Fangdi
    Qi, Yongze
    Wang, Cui
    Kim, Jonghyuk
    Wang, Shifeng
    IEEE SENSORS JOURNAL, 2025, 25 (03) : 5866 - 5877
  • [9] DROP SPARSE CONVOLUTION FOR 3D OBJECT DETECTION
    Zhu, Taohong
    Shen, Jun
    Wang, Chali
    Xiong, Huiyuan
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 3185 - 3189
  • [10] FSD V2: Improving Fully Sparse 3D Object Detection With Virtual Voxels
    Fan, Lue
    Wang, Feng
    Wang, Naiyan
    Zhang, Zhaoxiang
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (02) : 1279 - 1292