Learning Cross-Attention Point Transformer With Global Porous Sampling

Cited by: 0
Authors
Duan, Yueqi [1 ]
Sun, Haowen [2 ]
Yan, Juncheng [2 ]
Lu, Jiwen [2 ]
Zhou, Jie [2 ]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Dept Automat, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Point cloud compression; Transformers; Global Positioning System; Convolution; Three-dimensional displays; Geometry; Feature extraction; Training data; Sun; Shape; Point cloud; 3D deep learning; transformer; cross-attention; NETWORK
DOI
10.1109/TIP.2024.3486612
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a point-based cross-attention transformer named CrossPoints with a parametric Global Porous Sampling (GPS) strategy. The attention module is crucial for capturing the correlations between different tokens in transformers. Most existing point-based transformers design multi-scale self-attention operations over point clouds down-sampled by the widely used Farthest Point Sampling (FPS) strategy. However, FPS only generates sub-clouds with holistic structures, which fails to fully exploit the flexibility of points to generate diversified tokens for the attention module. To address this, we design a cross-attention module with parametric GPS and Complementary GPS (C-GPS) strategies that generates a series of diversified tokens through controllable parameters. We show that FPS is a degenerate case of GPS, and that the network learns richer relational information about structure and geometry when we perform consecutive cross-attention over the tokens generated by GPS- and C-GPS-sampled points. More specifically, we take evenly sampled points as queries and design our cross-attention layers with GPS- and C-GPS-sampled points as keys and values. To further improve the diversity of tokens, we design a deformable operation over points that adaptively adjusts the points according to the input. Extensive experimental results on both shape classification and indoor scene segmentation tasks indicate promising boosts over recent point cloud transformers. We also conduct ablation studies to show the effectiveness of our proposed cross-attention module with the GPS strategy.
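
The abstract describes a concrete data flow: evenly sampled query points attend, via cross-attention, to keys and values drawn from GPS- and C-GPS-sampled points, with FPS a degenerate case of GPS. The PyTorch sketch below illustrates that flow under stated assumptions only: the abstract does not give the GPS parametrization, so `porous_split` is a hypothetical stand-in (rank points by FPS, keep every `porosity`-th one; `porosity == 1` recovers plain FPS), and `PointCrossAttention` is a generic cross-attention block, not the paper's exact module or its deformable point adjustment.

```python
import torch
import torch.nn as nn


def farthest_point_ranking(xyz, m):
    """Rank m points of xyz (B, N, 3) by iterative farthest-point selection.

    Taking the ranking as-is is standard FPS; the porous variant below
    subsamples this ranking.
    """
    B, N, _ = xyz.shape
    idx = torch.zeros(B, m, dtype=torch.long, device=xyz.device)
    dist = torch.full((B, N), float("inf"), device=xyz.device)
    farthest = torch.zeros(B, dtype=torch.long, device=xyz.device)
    batch = torch.arange(B, device=xyz.device)
    for i in range(m):
        idx[:, i] = farthest
        centroid = xyz[batch, farthest].unsqueeze(1)                # (B, 1, 3)
        dist = torch.minimum(dist, ((xyz - centroid) ** 2).sum(-1)) # nearest-center distance
        farthest = dist.argmax(-1)                                  # next farthest point
    return idx


def porous_split(xyz, m, porosity):
    """Hypothetical stand-in for GPS / C-GPS (the paper's exact rule may differ):
    rank m * porosity points by FPS, keep every `porosity`-th one as the porous
    sample (GPS) and the skipped ones as its complement (C-GPS).
    porosity == 1 degenerates to plain FPS, mirroring the claim in the abstract.
    """
    ranked = farthest_point_ranking(xyz, m * porosity)              # (B, m * porosity)
    keep = torch.zeros(m * porosity, dtype=torch.bool, device=xyz.device)
    keep[::porosity] = True
    return ranked[:, keep], ranked[:, ~keep]                        # GPS idx, C-GPS idx


class PointCrossAttention(nn.Module):
    """Generic cross-attention block: query tokens attend to key/value tokens."""

    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, q_feat, kv_feat):
        out, _ = self.attn(q_feat, kv_feat, kv_feat)                # (B, Mq, dim)
        return self.norm(q_feat + out)                              # residual + norm


# Toy usage: evenly covering queries attend to GPS tokens, then to C-GPS tokens.
B, N, C = 2, 1024, 64
xyz, feat = torch.rand(B, N, 3), torch.rand(B, N, C)

q_idx = farthest_point_ranking(xyz, 128)                            # query positions
gps_idx, cgps_idx = porous_split(xyz, m=128, porosity=2)

def gather(f, i):                                                   # pick rows (B, M) from (B, N, C)
    return f.gather(1, i.unsqueeze(-1).expand(-1, -1, f.shape[-1]))

layer = PointCrossAttention(C)
q = layer(gather(feat, q_idx), gather(feat, gps_idx))               # attend to GPS tokens
q = layer(q, gather(feat, cgps_idx))                                # then to C-GPS tokens
```

Chaining the two attention calls mirrors the "consecutive cross-attention" the abstract emphasizes: because GPS and C-GPS partition the ranked sample, the same query points see two structurally different token sets of one shape.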
Pages: 6283-6297
Page count: 15