CenterFormer: Center-Based Transformer for 3D Object Detection

Cited by: 61
Authors
Zhou, Zixiang [1, 2]
Zhao, Xiangchen [1]
Wang, Yu [1]
Wang, Panqu [1]
Foroosh, Hassan [2]
Affiliations
[1] TuSimple, San Diego, CA 92122 USA
[2] University of Central Florida, Computational Imaging Lab, Orlando, FL 32816 USA
Source
Keywords
LiDAR point cloud; 3D object detection; Transformer; Multi-frame fusion
DOI
10.1007/978-3-031-19839-7_29
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Query-based transformer has shown great potential in constructing long-range attention in many image-domain tasks, but has rarely been considered in LiDAR-based 3D object detection due to the overwhelming size of the point cloud data. In this paper, we propose CenterFormer, a center-based transformer network for 3D object detection. CenterFormer first uses a center heatmap to select center candidates on top of a standard voxel-based point cloud encoder. It then uses the feature of the center candidate as the query embedding in the transformer. To further aggregate features from multiple frames, we design an approach to fuse features through cross-attention. Lastly, regression heads are added to predict the bounding box on the output center feature representation. Our design reduces the convergence difficulty and computational complexity of the transformer structure. The results show significant improvements over the strong baseline of anchor-free object detection networks. CenterFormer achieves state-of-the-art performance for a single model on the Waymo Open Dataset, with 73.7% mAPH on the validation set and 75.6% mAPH on the test set, significantly outperforming all previously published CNN and transformer-based methods. Our code is publicly available at https://github.com/TuSimple/centerformer
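Illustrative note: the query-selection step described in the abstract (pick center candidates from a heatmap, then use each candidate's feature as a transformer query) can be sketched roughly as follows. This is a toy, pure-Python sketch and not the authors' implementation: the function names, the single-head attention over every cell of a tiny feature grid, and the sample values are all assumptions made for illustration. The released code at the URL above is the authoritative reference.

```python
import math

def select_center_candidates(heatmap, k):
    """Return the (row, col) positions of the k highest heatmap scores."""
    scored = [
        (score, (r, c))
        for r, row in enumerate(heatmap)
        for c, score in enumerate(row)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pos for _, pos in scored[:k]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attention(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    logits = [dot(query, key) / scale for key in keys]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)
    weights = [w / total for w in weights]
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

def center_queries(heatmap, features, k):
    """Hypothetical helper: gather the feature at each selected center."""
    centers = select_center_candidates(heatmap, k)
    return centers, [features[r][c] for r, c in centers]

# Toy inputs (assumed values): a 2x2 heatmap and a 2-dim feature per cell.
heatmap = [[0.1, 0.9],
           [0.7, 0.2]]
features = [[[1.0, 0.0], [0.0, 1.0]],
            [[1.0, 1.0], [0.5, 0.5]]]

centers, queries = center_queries(heatmap, features, k=2)
# Here every cell serves as key and value; this is a simplification.
cells = [cell for row in features for cell in row]
refined = [cross_attention(q, cells, cells) for q in queries]
```

Because only k queries enter the attention step, the cost scales with the number of candidates rather than with the full point cloud, which is the motivation the abstract gives for the center-based design.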
Pages: 496-513
Page count: 18
Related papers
50 results in total
  • [1] Center-based 3D Object Detection and Tracking
    Yin, Tianwei
    Zhou, Xingyi
    Krähenbühl, Philipp
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 11779 - 11788
  • [2] CenterFusion: Center-based Radar and Camera Fusion for 3D Object Detection
    Nabati, Ramin
    Qi, Hairong
    2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2021), 2021, : 1526 - 1535
  • [3] Scalable 3D Object Detection Pipeline With Center-Based Sequential Feature Aggregation for Intelligent Vehicles
    Jiang, Qi
    Hu, Chuan
    Zhao, Baixuan
    Huang, Yonghui
    Zhang, Xi
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2024, 9 (01): : 1512 - 1523
  • [4] Voxel Transformer for 3D Object Detection
    Mao, Jiageng
    Xue, Yujing
    Niu, Minzhe
    Bai, Haoyue
    Feng, Jiashi
    Liang, Xiaodan
    Xu, Hang
    Xu, Chunjing
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 3144 - 3153
  • [5] 3D point cloud object detection algorithm based on Transformer
    Liu M.
    Yang Q.
    Hu G.
    Guo Y.
    Zhang J.
    Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University, 2023, 41 (06): : 1190 - 1197
  • [6] OcTr: Octree-based Transformer for 3D Object Detection
    Zhou, Chao
    Zhang, Yanan
    Chen, Jiaxin
    Huang, Di
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 5166 - 5175
  • [7] CFTrack: Center-based Radar and Camera Fusion for 3D Multi-Object Tracking
    Nabati, Ramin
    Harris, Landon
    Qi, Hairong
    2021 IEEE INTELLIGENT VEHICLES SYMPOSIUM WORKSHOPS (IV WORKSHOPS), 2021, : 243 - 248
  • [8] Dynamic graph transformer for 3D object detection
    Ren, Siyuan
    Pan, Xiao
    Zhao, Wenjie
    Nie, Binling
    Han, Bo
    KNOWLEDGE-BASED SYSTEMS, 2023, 259
  • [9] CenterCoop: Center-Based Feature Aggregation for Communication-Efficient Vehicle-Infrastructure Cooperative 3D Object Detection
    Zhou, Linyi
    Gan, Zhongxue
    Fan, Jiayuan
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (04) : 3570 - 3577
  • [10] Fusion information enhanced method based on transformer for 3D object detection
    Jin Y.
    Tao C.
    Yi Qi Yi Biao Xue Bao/Chinese Journal of Scientific Instrument, 2023, 44 (12): : 297 - 306