Learning Cross-Attention Point Transformer With Global Porous Sampling

Times Cited: 0
Authors
Duan, Yueqi [1 ]
Sun, Haowen [2 ]
Yan, Juncheng [2 ]
Lu, Jiwen [2 ]
Zhou, Jie [2 ]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Dept Automat, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Point cloud compression; Transformers; Global Positioning System; Convolution; Three-dimensional displays; Geometry; Feature extraction; Training data; Sun; Shape; Point cloud; 3D deep learning; transformer; cross-attention; NETWORK;
DOI
10.1109/TIP.2024.3486612
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a point-based cross-attention transformer named CrossPoints with a parametric Global Porous Sampling (GPS) strategy. The attention module is crucial for capturing the correlations between different tokens in transformers. Most existing point-based transformers design multi-scale self-attention operations over point clouds down-sampled with the widely used Farthest Point Sampling (FPS) strategy. However, FPS only generates sub-clouds with holistic structures, which fails to fully exploit the flexibility of points to generate diversified tokens for the attention module. To address this, we design a cross-attention module with parametric GPS and Complementary GPS (C-GPS) strategies that generates a series of diversified tokens through controllable parameters. We show that FPS is a degenerate case of GPS, and that the network learns richer relational information about structure and geometry when we perform consecutive cross-attention over the tokens generated from GPS- and C-GPS-sampled points. More specifically, we set evenly sampled points as queries and design our cross-attention layers with GPS- and C-GPS-sampled points as keys and values. To further improve the diversity of tokens, we design a deformable operation over points that adaptively adjusts the points according to the input. Extensive experimental results on both shape classification and indoor scene segmentation tasks show promising gains over recent point cloud transformers. We also conduct ablation studies to demonstrate the effectiveness of our proposed cross-attention module with the GPS strategy.
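As a rough illustration of the query/key-value split described in the abstract, the following is a minimal PyTorch sketch. It is not the paper's implementation: the names porous_sample and CrossAttentionBlock are hypothetical, and the random sampler is only a placeholder for the parametric GPS and C-GPS strategies, whose details are not given in this record.

```python
# Minimal sketch, assuming a standard multi-head cross-attention layer where
# queries come from evenly sampled points and keys/values come from a
# differently sampled sub-cloud (GPS / C-GPS in the paper).
import torch
import torch.nn as nn


def porous_sample(xyz, feats, m):
    """Placeholder sampler: draw m random points per cloud.

    Stands in for the paper's parametric GPS / C-GPS strategies (of which
    FPS is described as a degenerate case); the real samplers are not
    specified in this record.
    """
    b, n = xyz.shape[0], xyz.shape[1]
    idx = torch.stack([torch.randperm(n, device=xyz.device)[:m] for _ in range(b)])
    batch = torch.arange(b, device=xyz.device).unsqueeze(-1)
    return xyz[batch, idx], feats[batch, idx]


class CrossAttentionBlock(nn.Module):
    """Query tokens attend to key/value tokens from another sampled sub-cloud."""

    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, q_feats, kv_feats):
        # Cross-attention: queries from the evenly sampled points,
        # keys/values from the GPS / C-GPS sampled points.
        out, _ = self.attn(q_feats, kv_feats, kv_feats)
        x = self.norm1(q_feats + out)
        return self.norm2(x + self.ffn(x))


if __name__ == "__main__":
    B, N, C = 2, 1024, 64
    xyz, feats = torch.rand(B, N, 3), torch.rand(B, N, C)
    _, q_feats = porous_sample(xyz, feats, 256)    # stand-in for even query sampling
    _, kv_feats = porous_sample(xyz, feats, 512)   # stand-in for GPS / C-GPS sampling
    print(CrossAttentionBlock(C)(q_feats, kv_feats).shape)  # torch.Size([2, 256, 64])
```

In the paper this block is applied consecutively over tokens from GPS- and C-GPS-sampled points; the sketch only shows a single layer to make the token roles explicit.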
Pages: 6283 - 6297
Page Count: 15