Head-Free Lightweight Semantic Segmentation with Linear Transformer

Cited by: 0
Authors
Dong, Bo [1]
Wang, Pichao [1]
Wang, Fan [1]
Affiliations
[1] Alibaba Group, Hangzhou, People's Republic of China
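Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes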
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing semantic segmentation work has mainly focused on designing effective decoders; however, the computational load introduced by the overall structure has long been ignored, which hinders deployment on resource-constrained hardware. In this paper, we propose a head-free lightweight architecture designed specifically for semantic segmentation, named Adaptive Frequency Transformer (AFFormer). AFFormer adopts a parallel architecture that leverages prototype representations as specific learnable local descriptions, which replace the decoder and preserve the rich image semantics of high-resolution features. Although removing the decoder eliminates most of the computation, the accuracy of the parallel structure is still limited by the low computational budget. Therefore, we employ heterogeneous operators (CNN and Vision Transformer) for pixel embedding and prototype representations to further reduce computational cost. Moreover, it is very difficult to linearize the complexity of the Vision Transformer from the perspective of the spatial domain. Because semantic segmentation is very sensitive to frequency information, we construct a lightweight prototype learning block with an adaptive frequency filter of complexity O(n) to replace standard self-attention, which has complexity O(n^2). Extensive experiments on widely adopted datasets demonstrate that AFFormer achieves superior accuracy while retaining only 3M parameters. On the ADE20K dataset, AFFormer achieves 41.8 mIoU at 4.6 GFLOPs, which is 4.4 mIoU higher than SegFormer with 45% fewer GFLOPs. On the Cityscapes dataset, AFFormer achieves 78.7 mIoU at 34.4 GFLOPs, which is 2.5 mIoU higher than SegFormer with 72.5% fewer GFLOPs. Code is available at https://github.com/dongbo811/AFFormer.
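To make the O(n) versus O(n^2) contrast in the abstract concrete, the following minimal sketch counts how many pairwise scores a standard self-attention layer would compute for a feature map and compares that with an operator whose cost grows linearly in the number of tokens. It is an illustration of the scaling argument only, not AFFormer's implementation: the function names, the input sizes, and the downsampling stride of 8 are all assumptions chosen for the example.

```python
def token_count(height: int, width: int, stride: int) -> int:
    """Number of tokens n when an image is downsampled by `stride` (assumed value)."""
    return (height // stride) * (width // stride)


def attention_pair_count(n: int) -> int:
    """Standard self-attention scores every token against every token: n * n."""
    return n * n


def linear_op_count(n: int, per_token_cost: int = 1) -> int:
    """A hypothetical O(n) operator: a fixed amount of work per token."""
    return per_token_cost * n


if __name__ == "__main__":
    # Hypothetical square inputs; stride 8 is an assumption for illustration.
    for side in (512, 1024, 2048):
        n = token_count(side, side, stride=8)
        ratio = attention_pair_count(n) // linear_op_count(n)
        print(f"{side}x{side}: n={n}, n^2={attention_pair_count(n)}, n^2/n={ratio}")
```

Doubling the image side quadruples n, so the quadratic term grows sixteenfold while the linear term grows only fourfold, which is why the gap matters most on high-resolution features.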
Pages: 516-524
Page count: 9