TM2B: Transformer-Based Motion-to-Box Network for 3D Single Object Tracking on Point Clouds

Times Cited: 0
Authors
Xu, Anqi [1 ]
Nie, Jiahao [1 ]
He, Zhiwei [1 ]
Lv, Xudong [1 ]
Affiliations
[1] Hangzhou Dianzi Univ, Hangzhou 310018, Peoples R China
Source
IEEE ROBOTICS AND AUTOMATION LETTERS | 2024 | Vol. 9 | No. 08
Keywords
Transformers; Accuracy; Three-dimensional displays; Target tracking; Object tracking; Feature extraction; Point cloud compression; 3D single object tracking; motion-to-box; transformer
DOI
10.1109/LRA.2024.3418274
Chinese Library Classification (CLC)
TP24 [Robotics]
Discipline Classification Code
080202; 1405
Abstract
3D single object tracking plays a crucial role in numerous applications such as autonomous driving. Recent trackers based on the motion-centric paradigm perform well because they exploit motion cues to infer the target's relative motion across successive frames, which effectively overcomes the significant appearance variations of targets and distractors caused by occlusion. However, this motion-centric paradigm tends to require a multi-stage motion-to-box process to refine the motion cues, which suffers from tedious hyper-parameter tuning and elaborate subtask design. In this letter, we propose a novel transformer-based motion-to-box network (TM2B), which employs a learnable relation modeling transformer (LRMT) to generate accurate motion cues without multi-stage refinement. The proposed LRMT contains two novel attention mechanisms: hierarchical interactive attention and learnable query attention. The former builds a learnable, fixed-size sampling set for each query on multi-scale feature maps, enabling each query to adaptively select prominent sampling elements and thus encode multi-scale features in a lightweight manner; the latter computes a weighted sum of the encoded features with learnable global queries, extracting valuable motion cues from all available features and thereby achieving accurate object tracking. Extensive experiments demonstrate that TM2B achieves state-of-the-art performance on KITTI, nuScenes and the Waymo Open Dataset, while obtaining a significant improvement in inference speed over previous leading methods, reaching 56.8 FPS on a single NVIDIA 1080Ti GPU. The code is available at TM2B.
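As a reading aid, below is a minimal sketch of the two attention mechanisms described in the abstract, written in PyTorch (the letter does not specify a framework, and this is not the authors' released code). Hierarchical interactive attention is approximated as deformable-style sampling of a fixed number of points per query from multi-scale feature maps; learnable query attention is approximated as attention pooling of the encoded features with learnable global query embeddings. All class names, tensor shapes, hyper-parameters and the 4-DoF motion head are illustrative assumptions.

# Illustrative sketch only -- not the released TM2B implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class HierarchicalInteractiveAttention(nn.Module):
    """Each query samples a fixed number of points per feature level (assumed design)."""

    def __init__(self, dim=128, levels=3, points=4):
        super().__init__()
        self.levels, self.points = levels, points
        self.offsets = nn.Linear(dim, levels * points * 2)   # (dx, dy) offset per sample
        self.weights = nn.Linear(dim, levels * points)       # per-sample attention weight
        self.proj = nn.Linear(dim, dim)

    def forward(self, queries, ref_xy, feat_maps):
        # queries: (B, Q, C); ref_xy: (B, Q, 2) in [-1, 1]; feat_maps: list of (B, C, H, W)
        B, Q, _ = queries.shape
        offsets = self.offsets(queries).view(B, Q, self.levels, self.points, 2)
        weights = self.weights(queries).view(B, Q, self.levels * self.points)
        weights = weights.softmax(dim=-1).view(B, Q, self.levels, self.points, 1)
        sampled = []
        for lvl, fmap in enumerate(feat_maps):
            # Treat (Q, points) as the output grid for bilinear sampling.
            grid = (ref_xy[:, :, None, :] + offsets[:, :, lvl]).clamp(-1, 1)
            s = F.grid_sample(fmap, grid, align_corners=False)  # (B, C, Q, points)
            sampled.append(s.permute(0, 2, 3, 1))               # (B, Q, points, C)
        sampled = torch.stack(sampled, dim=2)                   # (B, Q, levels, points, C)
        out = (weights * sampled).sum(dim=(2, 3))               # (B, Q, C)
        return self.proj(out)


class LearnableQueryAttention(nn.Module):
    """Learnable global queries attend over all encoded features (assumed design)."""

    def __init__(self, dim=128, num_queries=4, heads=4):
        super().__init__()
        self.global_queries = nn.Parameter(torch.randn(num_queries, dim))
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(dim, 4)  # hypothetical relative motion output (dx, dy, dz, dtheta)

    def forward(self, encoded):
        # encoded: (B, N, C) features produced by the encoder
        q = self.global_queries.unsqueeze(0).expand(encoded.size(0), -1, -1)
        pooled, _ = self.attn(q, encoded, encoded)      # weighted sum over all features
        return self.head(pooled.mean(dim=1))            # coarse motion cue estimate

Because each query samples only a fixed number of points per level, the encoding cost is independent of the feature-map resolution, which is consistent with the abstract's claims of lightweight multi-scale encoding and high inference speed.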
Pages: 7078-7085
Page count: 8