TM2B: Transformer-Based Motion-to-Box Network for 3D Single Object Tracking on Point Clouds

Cited by: 0

Authors:

Xu, Anqi [1]

Nie, Jiahao [1]

He, Zhiwei [1]

Lv, Xudong [1]

Affiliations:

[1] Sch Hangzhou Dianzi Univ, Hangzhou 310018, Peoples R China

Source:

IEEE ROBOTICS AND AUTOMATION LETTERS | 2024 / Vol. 9 / Issue 08

Keywords:

Transformers; Accuracy; Three-dimensional displays; Target tracking; Object tracking; Feature extraction; Point cloud compression; 3D single object tracking; motion-to-box; transformer;

DOI:

10.1109/LRA.2024.3418274

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Discipline classification code:

080202 ; 1405 ;

Abstract:

3D single object tracking plays a crucial role in numerous applications such as autonomous driving. Recent trackers based on the motion-centric paradigm perform well because they exploit motion cues to infer the target's relative motion across successive frames, which effectively overcomes the significant appearance variations of targets and distractors caused by occlusion. However, such a motion-centric paradigm tends to require multi-stage motion-to-box processing to refine the motion cues, which suffers from tedious hyper-parameter tuning and elaborate subtask designs. In this letter, we propose a novel transformer-based motion-to-box network (TM2B), which employs a learnable relation modeling transformer (LRMT) to generate accurate motion cues without multi-stage refinements. Our proposed LRMT contains two novel attention mechanisms: hierarchical interactive attention and learnable query attention. The former builds a learnable, fixed-size sampling set for each query on multi-scale feature maps, enabling each query to adaptively select prominent sampling elements and thus effectively encoding multi-scale features in a lightweight manner. The latter calculates the weighted sum of the encoded features with a learnable global query, which makes it possible to extract valuable motion cues from all available features and thereby achieve accurate object tracking. Extensive experiments demonstrate that TM2B achieves state-of-the-art performance on KITTI, NuScenes and Waymo Open Dataset, while obtaining a significant improvement in inference speed over previous leading methods, achieving 56.8 FPS on a single NVIDIA 1080Ti GPU. The code is available at TM2B.
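
The abstract describes learnable query attention as a weighted sum of the encoded features computed with a learnable global query. The sketch below illustrates only that general idea, as softmax attention pooling with a single query vector. It is not the authors' implementation: the function name `learnable_query_pooling`, the scaled dot-product scoring, and the absence of key/value projections are all assumptions made for illustration, and in a trained network the query would be a learned parameter rather than a fixed list.

```python
import math


def learnable_query_pooling(features, query):
    """Pool N encoded feature vectors into one vector with a single global query.

    features: list of N vectors, each of length d (stand-in for encoded features).
    query: one vector of length d (stand-in for a learnable global query).
    Returns (pooled, weights). Hypothetical helper, not taken from the paper.
    """
    d = len(query)
    # Scaled dot-product score between the global query and each feature.
    scores = [sum(q * x for q, x in zip(query, feat)) / math.sqrt(d)
              for feat in features]
    # Numerically stable softmax over the N scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the features, one output value per channel.
    pooled = [sum(w * feat[j] for w, feat in zip(weights, features))
              for j in range(d)]
    return pooled, weights


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    pooled, weights = learnable_query_pooling(feats, [0.0, 0.0])
    print(pooled, weights)
```

With an all-zero query the attention weights are uniform, so the result reduces to mean pooling; a trained query would instead weight the features it scores highly.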

Pages: 7078-7085

Page count: 8
