3D-MAN: 3D Multi-frame Attention Network for Object Detection

Cited by: 59
Authors
Yang, Zetong [1]
Zhou, Yin [2]
Chen, Zhifeng [3]
Ngiam, Jiquan [3]
Affiliations
[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[2] Waymo LLC, Mountain View, CA USA
[3] Google Res, Brain Team, Mountain View, CA USA
DOI
10.1109/CVPR46437.2021.00190
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
3D object detection is an important module in autonomous driving and robotics. However, many existing methods focus on using single frames to perform 3D detection, and do not fully utilize information from multiple frames. In this paper, we present 3D-MAN: a 3D multi-frame attention network that effectively aggregates features from multiple perspectives and achieves state-of-the-art performance on the Waymo Open Dataset. 3D-MAN first uses a novel fast single-frame detector to produce box proposals. The box proposals and their corresponding feature maps are then stored in a memory bank. We design a multi-view alignment and aggregation module, using attention networks, to extract and aggregate the temporal features stored in the memory bank. This effectively combines the features coming from different perspectives of the scene. We demonstrate the effectiveness of our approach on the large-scale, complex Waymo Open Dataset, achieving state-of-the-art results compared to published single-frame and multi-frame methods.
Pages: 1863-1872
Page count: 10