ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention

Cited by: 0
Authors
He, Chenhang [1]
Li, Ruihuang [1,2]
Zhang, Guowen [1]
Zhang, Lei [1,2]
Affiliations
[1] The Hong Kong Polytechnic University, Hong Kong, China
[2] OPPO Research, Shenzhen, China
Keywords
3D Object Detection; Voxel Transformer
DOI
10.1007/978-3-031-73397-0_5
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Window-based transformers excel in large-scale point cloud understanding by capturing context-aware representations with affordable attention computation in a more localized manner. However, the sparse nature of point clouds leads to a significant variance in the number of voxels per window. Existing methods group the voxels in each window into fixed-length sequences through extensive sorting and padding operations, resulting in a non-negligible computational and memory overhead. In this paper, we introduce ScatterFormer, which, to the best of our knowledge, is the first to directly apply attention to voxels across different windows as a single sequence. The key to ScatterFormer is a Scattered Linear Attention (SLA) module, which leverages the pre-computation of key-value pairs in linear attention to enable parallel computation on the variable-length voxel sequences divided by windows. Leveraging the hierarchical structure of GPUs and shared memory, we propose a chunk-wise algorithm that reduces the SLA module's latency to less than 1 millisecond on moderate GPUs. Furthermore, we develop a cross-window interaction module that improves the locality and connectivity of voxel features across different windows, eliminating the need for extensive window shifting. Our proposed ScatterFormer achieves 73.8 mAP (L2) on the Waymo Open Dataset and 72.4 NDS on the NuScenes dataset, running at an outstanding detection rate of 23 FPS. The code is available at https://github.com/skyhehe123/ScatterFormer.
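
The abstract describes SLA as window-wise linear attention in which key-value products are pre-computed per window so that variable-length voxel sequences can be processed in parallel. The following is a minimal PyTorch sketch of that idea, assuming a flat voxel sequence with a per-voxel window_ids index; the function name, the elu+1 feature map, and the gather-based formulation are illustrative assumptions, not the authors' chunk-wise shared-memory kernel.

    import torch

    def scattered_linear_attention(q, k, v, window_ids):
        # Hypothetical signature for illustration (not the authors' CUDA kernel).
        # q, k, v:     (N, H, D) voxel queries/keys/values, N voxels in total
        # window_ids:  (N,) integer id of the window each voxel belongs to
        q = torch.nn.functional.elu(q) + 1   # assumed non-negative feature map
        k = torch.nn.functional.elu(k) + 1

        num_windows = int(window_ids.max()) + 1
        H, D = q.shape[1], q.shape[2]

        # Pre-compute per-window key-value products: sum_j k_j v_j^T -> (W, H, D, D)
        kv = q.new_zeros(num_windows, H, D, D)
        kv.index_add_(0, window_ids, torch.einsum('nhd,nhe->nhde', k, v))

        # Per-window normalizer: sum_j k_j -> (W, H, D)
        z = q.new_zeros(num_windows, H, D)
        z.index_add_(0, window_ids, k)

        # Each voxel attends within its own window; the cost is O(N * D^2),
        # independent of how many voxels any single window contains.
        num = torch.einsum('nhd,nhde->nhe', q, kv[window_ids])
        den = torch.einsum('nhd,nhd->nh', q, z[window_ids]).clamp_min(1e-6)
        return num / den.unsqueeze(-1)

    # Example: 6 voxels spread over 2 windows, 4 heads, 16 channels per head.
    feats = torch.randn(6, 4, 16)
    ids = torch.tensor([0, 0, 0, 1, 1, 1])
    out = scattered_linear_attention(feats, feats, feats, ids)   # (6, 4, 16)

Because the per-window sums are accumulated with scatter-style additions over a single flat sequence, no sorting or padding to a fixed window length is needed, which is the property the paper's chunk-wise algorithm exploits on the GPU.
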
Pages: 74-92
Number of pages: 19
Related Papers (showing items 31-40 of 50)
  • [31] Du, Jing; Tang, Manting; Zhao, Li. Transformer-like Model with Linear Attention for Speech Emotion Recognition. Journal of Southeast University (English Edition), 2021, 37(2): 164-170.
  • [32] Lu, Haodong; Mei, Qichang; Wang, Kun. An Efficient Piecewise Linear Approximation of Non-linear Operations for Transformer Inference. 2023 IEEE 31st Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2023: 206.
  • [33] Tomida, Yuto; Katayama, Takafumi; Song, Tian; Shimamoto, Takashi. Efficient Deraining Model Using Transformer and Kernel Basis Attention for UAVs. 2024 International Technical Conference on Circuits/Systems, Computers, and Communications (ITC-CSCC), 2024.
  • [34] Deng, Anping; Han, Guangliang; Zhang, Zhongbo; Chen, Dianbing; Ma, Tianjiao; Liu, Zhichao. Cross-Parallel Attention and Efficient Match Transformer for Aerial Tracking. Remote Sensing, 2024, 16(6).
  • [35] Yi, Zengrui; Meng, Hua; Gao, Lu; He, Zhonghang; Yang, Meng. Efficient Convolutional Dual-Attention Transformer for Automatic Modulation Recognition. Applied Intelligence, 2025, 55(3).
  • [36] Pu, Yifan; Xia, Zhuofan; Guo, Jiayi; Han, Dongchen; Li, Qixiu; Li, Duo; Yuan, Yuhui; Li, Ji; Han, Yizeng; Song, Shiji; Huang, Gao; Li, Xiu. Efficient Diffusion Transformer with Step-Wise Dynamic Attention Mediators. Computer Vision - ECCV 2024, Part XV, 2025, 15073: 424-441.
  • [37] Lee, Eunho; Hwang, Youngbae. Decomformer: Decompose Self-Attention of Transformer for Efficient Image Restoration. IEEE Access, 2024, 12: 38672-38684.
  • [38] Qin, Bosheng; Li, Juncheng; Tang, Siliang; Zhuang, Yueting. DBA: Efficient Transformer With Dynamic Bilinear Low-Rank Attention. IEEE Transactions on Neural Networks and Learning Systems, 2025.
  • [39] Zheng, Ruizhe; Li, Jun; Wang, Yi; Luo, Tian; Yu, Yuguo. ScatterFormer: Locally-Invariant Scattering Transformer for Patient-Independent Multispectral Detection of Epileptiform Discharges. Thirty-Seventh AAAI Conference on Artificial Intelligence, Vol. 37, No. 1, 2023: 148-158.
  • [40] Suwanwimolkul, Suwichaya; Komorita, Satoshi. Efficient Linear Attention for Fast and Accurate Keypoint Matching. Proceedings of the 2022 International Conference on Multimedia Retrieval (ICMR 2022), 2022: 330-341.