3D-MAN: 3D Multi-frame Attention Network for Object Detection

Cited by: 59
Authors
Yang, Zetong [1]
Zhou, Yin [2]
Chen, Zhifeng [3]
Ngiam, Jiquan [3]
Affiliations
[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[2] Waymo LLC, Mountain View, CA USA
[3] Google Res, Brain Team, Mountain View, CA USA
DOI
10.1109/CVPR46437.2021.00190
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
3D object detection is an important module in autonomous driving and robotics. However, many existing methods focus on using single frames to perform 3D detection, and do not fully utilize information from multiple frames. In this paper, we present 3D-MAN: a 3D multi-frame attention network that effectively aggregates features from multiple perspectives and achieves state-of-the-art performance on the Waymo Open Dataset. 3D-MAN first uses a novel fast single-frame detector to produce box proposals. The box proposals and their corresponding feature maps are then stored in a memory bank. We design a multi-view alignment and aggregation module, using attention networks, to extract and aggregate the temporal features stored in the memory bank. This effectively combines the features coming from different perspectives of the scene. We demonstrate the effectiveness of our approach on the large-scale, complex Waymo Open Dataset, achieving state-of-the-art results compared to published single-frame and multi-frame methods.
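The memory-bank-plus-attention idea described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the class `MemoryBank`, the functions `softmax` and `attend`, and the parameter `max_frames` are hypothetical names chosen for illustration, and plain scaled dot-product attention without learned projections stands in for the paper's alignment and aggregation module.

```python
import math
from collections import deque


class MemoryBank:
    """Hypothetical store for proposal features of the most recent frames."""

    def __init__(self, max_frames):
        # A bounded deque drops the oldest frame once max_frames is exceeded.
        self.frames = deque(maxlen=max_frames)

    def add(self, proposal_features):
        # proposal_features: list of feature vectors, one per box proposal.
        self.frames.append(list(proposal_features))

    def all_features(self):
        # Flatten the stored frames into a single list of feature vectors.
        return [feat for frame in self.frames for feat in frame]


def softmax(scores):
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, stored):
    """Scaled dot-product attention of one query vector over stored vectors.

    The stored vectors serve as both keys and values (an assumption made
    to keep the sketch short).
    """
    if not stored:
        raise ValueError("memory bank is empty")
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in stored]
    weights = softmax(scores)
    return [
        sum(w * value[i] for w, value in zip(weights, stored))
        for i in range(len(query))
    ]
```

In this sketch, a feature vector from the current frame would act as the query, and `attend(query, bank.all_features())` would return a weighted combination of the features kept from earlier frames.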
Pages: 1863-1872
Page count: 10
Related papers
50 in total
  • [41] SPGroup3D: Superpoint Grouping Network for Indoor 3D Object Detection
    Zhu, Yun
    Hui, Le
    Shen, Yaqi
    Xie, Jin
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 7, 2024, : 7811 - 7819
  • [42] AMVFNet: Attentive Multi-View Fusion Network for 3D Object Detection
    Huang, Yuxiao
    Huang, Zhicong
    Zhao, Jingwen
    Hu, Haifeng
    Chen, Dihu
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2025, 21 (01)
  • [43] Deformable Feature Fusion Network for Multi-Modal 3D Object Detection
    Guo, Kun
    Gan, Tong
    Ding, Zhao
    Ling, Qiang
    2024 3RD INTERNATIONAL CONFERENCE ON ROBOTICS, ARTIFICIAL INTELLIGENCE AND INTELLIGENT CONTROL, RAIIC 2024, 2024, : 363 - 367
  • [44] MA3D: A Multi-Attention-based Complex 3D Object Detection from Point Cloud Data
    Liao, Lyuchao
    Feng, Zhicheng
    Huang, Dejuan
    Zhu, Yintian
    Lin, Jinmei
    Luo, Linsen
    Journal of Network Intelligence, 2022, 7 (03): 719 - 733
  • [45] A robust 3D unique descriptor for 3D object detection
    Joshi, Piyush
    Rastegarpanah, Alireza
    Stolkin, Rustam
    PATTERN ANALYSIS AND APPLICATIONS, 2024, 27 (03)
  • [46] Investigating Attention Mechanism in 3D Point Cloud Object Detection
    Qiu, Shi
    Wu, Yunfan
    Anwar, Saeed
    Li, Chongyi
    2021 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2021), 2021, : 403 - 412
  • [47] Attention-based Proposals Refinement for 3D Object Detection
    Minh-Quan Dao
    Hery, Elwan
    Fremont, Vincent
    2022 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2022, : 197 - 205
  • [48] 3D Object Detection with Attention: Shell-Based Modeling
    Zhang, X.
    Zhao, Z.
    Sun, W.
    Cui, Q.
    Computer Systems Science and Engineering, 2023, 46 (01): 537 - 550
  • [49] FusionPainting: Multimodal Fusion with Adaptive Attention for 3D Object Detection
    Xu, Shaoqing
    Zhou, Dingfu
    Fang, Jin
    Yin, Junbo
    Bin, Zhou
    Zhang, Liangjun
    2021 IEEE INTELLIGENT TRANSPORTATION SYSTEMS CONFERENCE (ITSC), 2021, : 3047 - 3054
  • [50] 3D Lane Detection With Attention in Attention
    Gu, Yinchao
    Ma, Chao
    Li, Qian
    Yang, Xiaokang
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31: 1104 - 1108