Learning Cross-Attention Point Transformer With Global Porous Sampling

Times Cited: 0
Authors
Duan, Yueqi [1 ]
Sun, Haowen [2 ]
Yan, Juncheng [2 ]
Lu, Jiwen [2 ]
Zhou, Jie [2 ]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Dept Automat, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Point cloud compression; Transformers; Convolution; Three-dimensional displays; Geometry; Feature extraction; Training data; Shape; Point cloud; 3D deep learning; transformer; cross-attention; NETWORK
DOI
10.1109/TIP.2024.3486612
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a point-based cross-attention transformer named CrossPoints with a parametric Global Porous Sampling (GPS) strategy. The attention module is crucial for capturing the correlations between different tokens in transformers. Most existing point-based transformers design multi-scale self-attention operations on point clouds down-sampled by the widely used Farthest Point Sampling (FPS) strategy. However, FPS only generates sub-clouds with holistic structures, which fails to fully exploit the flexibility of points to generate diversified tokens for the attention module. To address this, we design a cross-attention module with parametric GPS and Complementary GPS (C-GPS) strategies to generate a series of diversified tokens through controllable parameters. We show that FPS is a degenerate case of GPS, and that the network learns richer relational information about structure and geometry when we perform consecutive cross-attention over the tokens generated by GPS- and C-GPS-sampled points. More specifically, we use evenly sampled points as queries and design our cross-attention layers with GPS- and C-GPS-sampled points as keys and values. To further improve the diversity of tokens, we design a deformable operation over points that adaptively adjusts the points according to the input. Extensive experimental results on both shape classification and indoor scene segmentation tasks show promising improvements over recent point cloud transformers. We also conduct ablation studies to demonstrate the effectiveness of our proposed cross-attention module with the GPS strategy.
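The abstract describes the core mechanism: evenly sampled points serve as queries, while GPS- and C-GPS-sampled points serve as keys and values in consecutive cross-attention layers. Below is a minimal, hypothetical PyTorch sketch of that query/key-value split. It is not the authors' implementation; the random partition used here only stands in for the parametric GPS / C-GPS sampling, whose exact form is not specified in the abstract, and all function and class names are illustrative.

# Minimal, hypothetical sketch (NOT the authors' implementation): cross-attention
# in which queries come from one sampled subset of a point cloud and keys/values
# come from the complementary subset, loosely mirroring the GPS / C-GPS idea.
import torch
import torch.nn as nn


def random_complementary_split(points: torch.Tensor, num_query: int):
    """Split each cloud (B, N, C) into a 'query' subset and its complement.

    A random partition is used here only as a stand-in for the parametric
    GPS / C-GPS sampling described in the abstract.
    """
    B, N, C = points.shape
    perm = torch.rand(B, N, device=points.device).argsort(dim=1)
    q_idx, kv_idx = perm[:, :num_query], perm[:, num_query:]
    q = points.gather(1, q_idx.unsqueeze(-1).expand(-1, -1, C))
    kv = points.gather(1, kv_idx.unsqueeze(-1).expand(-1, -1, C))
    return q, kv


class PointCrossAttention(nn.Module):
    """One cross-attention layer: sampled query tokens attend to complementary key/value tokens."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, 2 * dim), nn.GELU(), nn.Linear(2 * dim, dim))

    def forward(self, q_feat: torch.Tensor, kv_feat: torch.Tensor) -> torch.Tensor:
        attn_out, _ = self.attn(q_feat, kv_feat, kv_feat)  # queries attend to complementary tokens
        x = self.norm1(q_feat + attn_out)                  # residual + norm
        return self.norm2(x + self.ffn(x))                 # feed-forward + residual + norm


if __name__ == "__main__":
    B, N, C = 2, 1024, 64
    feats = torch.randn(B, N, C)                   # per-point features (already embedded)
    q, kv = random_complementary_split(feats, num_query=256)
    out = PointCrossAttention(dim=C)(q, kv)        # -> (2, 256, 64) updated query tokens
    print(out.shape)

In the paper's full architecture the queries, keys, and values would additionally pass through deformable point adjustment and multiple stacked layers; the sketch only illustrates the cross-attention between two complementary token sets.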
Pages: 6283-6297
Number of pages: 15
Related Papers
50 items in total
  • [41] Chu, Miao; De Maria, Giovanni Luigi; Dai, Ruobing; Benenati, Stefano; Yu, Wei; Zhong, Jiaxin; Kotronias, Rafail; Walsh, Jason; Andreaggi, Stefano; Zuccarelli, Vittorio; Chai, Jason; Channon, Keith; Banning, Adrian; Tu, Shengxian. DCCAT: Dual-Coordinate Cross-Attention Transformer for thrombus segmentation on coronary OCT. MEDICAL IMAGE ANALYSIS, 2024, 97.
  • [42] Li, Jie; Bao, Yu; Liu, Wenxin; Ji, Pengxiang; Wang, Lekang; Wang, Zhongbing. Twins transformer: Cross-attention based two-branch transformer network for rotating bearing fault diagnosis. MEASUREMENT, 2023, 223.
  • [43] Dasgupta, Agnibh; Thong, Xin. Robust Image Watermarking based on Cross-Attention and Invariant Domain Learning. 2023 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE, CSCI 2023, 2023: 1125-1132.
  • [44] Ye, Yunan; Zhang, Shifeng; Li, Yimeng; Qian, Xufeng; Tang, Siliang; Pu, Shiliang; Xiao, Jun. Video question answering via grounded cross-attention network learning. INFORMATION PROCESSING & MANAGEMENT, 2020, 57 (04).
  • [45] Dan, Zhengjia; Zhao, Yue; Bi, Xiaojun; Wu, Licheng; Ji, Qiang. Multi-task Learning with Auxiliary Cross-attention Transformer for Low-Resource Multi-dialect Speech Recognition. NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2022, PT I, 2022, 13551: 107-118.
  • [46] Lorenzo, Abelardo Carlos Martinez; Cabot, Pere-Lluis Huguet; Navigli, Roberto. Cross-lingual AMR Aligner: Paying Attention to Cross-Attention. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023: 1726-1742.
  • [47] Zhang, Xiaoshan; Shi, Enze; Yu, Sigang; Zhang, Shu. DTCA: Dual-Branch Transformer with Cross-Attention for EEG and Eye Movement Data Fusion. MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT II, 2024, 15002: 141-151.
  • [48] Lin, Wei-Dong; Deng, Yu-Yan; Gao, Yang; Wang, Ning; Liu, Ling-Qiao; Zhang, Lei; Wang, Peng. CAT: A Simple yet Effective Cross-Attention Transformer for One-Shot Object Detection. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2024, 39 (02): 460-471.
  • [49] Wang, Yue; Liu, Jinlai; Wang, Xiaojie. Image Caption with Synchronous Cross-Attention. PROCEEDINGS OF THE THEMATIC WORKSHOPS OF ACM MULTIMEDIA 2017 (THEMATIC WORKSHOPS'17), 2017: 433-441.
  • [50] Dong, Wenfeng; Fu, Jin; Zou, Nan; Zhao, Chunpeng; Miao, Yixin; Shen, Zheng. CAF-ViT: A cross-attention based Transformer network for underwater acoustic target recognition. OCEAN ENGINEERING, 2025, 318.