Fully Sparse Fusion for 3D Object Detection

Citations: 7
Authors
Li, Yingyan [1 ,2 ]
Fan, Lue [1 ,2 ]
Liu, Yang [1 ]
Huang, Zehao [3 ]
Chen, Yuntao [4 ]
Wang, Naiyan [3 ]
Zhang, Zhaoxiang [1 ,2 ,4 ]
Affiliations
[1] Chinese Acad Sci CASIA, Inst Automat, Ctr Research on Intelligent Percept & Comp CRIPAC, State Key Lab Multimodal Artificial Intelligence S, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci UCAS, Sch Future Technol, Beijing 100049, Peoples R China
[3] TuSimple, Beijing 100020, Peoples R China
[4] Chinese Acad Sci HKISICAS, Hong Kong Inst Sci & Innovat, Ctr Artificial Intelligence & Robot, Hong Kong, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords
Three-dimensional displays; Feature extraction; Laser radar; Cameras; Detectors; Instance segmentation; Point cloud compression; 3D object detection; multi-sensor fusion; fully sparse architecture; autonomous driving; long-range perception;
DOI
10.1109/TPAMI.2024.3392303
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Prevalent multi-modal 3D detection methods rely on dense detectors that usually use dense Bird's-Eye-View (BEV) feature maps. However, the cost of such BEV feature maps grows quadratically with the detection range, so they do not scale to long-range detection. Recently, LiDAR-only fully sparse architectures have been gaining attention for their high efficiency in long-range perception. In this paper, we study how to develop a multi-modal fully sparse detector. Specifically, our proposed detector integrates well-studied 2D instance segmentation into the LiDAR side, in parallel with the 3D instance segmentation part of the LiDAR-only baseline. The proposed instance-based fusion framework maintains full sparsity while overcoming the constraints of the LiDAR-only fully sparse detector. Our framework achieves state-of-the-art performance on the widely used nuScenes dataset, the Waymo Open Dataset, and the long-range Argoverse 2 dataset. Notably, under the long-range perception setting, the inference speed of our method is 2.7x faster than that of other state-of-the-art multi-modal 3D detection methods.
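As a rough illustration of the quadratic scaling mentioned in the abstract, the sketch below counts the cells in a square dense BEV grid centered on the sensor. The function name `dense_bev_cells`, the 0.5 m cell size, and the example ranges are assumptions chosen for illustration; they are not values taken from the paper.

```python
def dense_bev_cells(detection_range_m: float, cell_size_m: float = 0.5) -> int:
    """Count cells in a square BEV grid covering [-range, +range] on both axes.

    Hypothetical helper for illustration only; the cell size is an assumed value.
    """
    cells_per_side = round(2 * detection_range_m / cell_size_m)
    return cells_per_side * cells_per_side


if __name__ == "__main__":
    # Doubling the detection range quadruples the number of grid cells.
    for range_m in (50.0, 100.0, 200.0):
        print(f"range {range_m:>5.0f} m -> {dense_bev_cells(range_m)} cells")
```

Under these assumptions, each doubling of the range multiplies the dense grid size by four, whereas a fully sparse representation only stores features where points or instances actually exist.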
Pages: 7217-7231
Page count: 15