FormerUnify: Transformer-Based Unified Fusion for Efficient Image Matting

Authors
Wang, Jiaquan [1]
Affiliations
[1] Shanghai University, Shanghai 200444, People's Republic of China
Keywords
Image matting; Transformer; Feature pyramid; Unified fusion
DOI
10.1007/978-981-97-8685-5_29
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, deep learning-based methods for image matting have incorporated additional modules and complex network structures to capture more comprehensive image information, thereby achieving higher accuracy. However, these innovations inevitably reduce inference speed and increase computational resource consumption. In this paper, we propose a Transformer-based unified fusion network for image matting, denoted as FormerUnify. Compared to existing methods, it achieves a better balance between accuracy and efficiency. FormerUnify is built upon the classic encoder-decoder framework, with its centerpiece being the Unified Fusion Decoder. This decoder is composed of three essential layers: a unify layer, a fusion layer, and an upsampling prediction head, which work in concert to effectively unify and fuse the rich multi-scale features extracted by the encoder. Furthermore, we couple the Unified Fusion Decoder with an advanced Transformer-based encoder, and optimize their integration to enhance their compatibility and performance. Experimental evaluations on two synthetic datasets (Composition-1K and Distinctions-646) and a real-world dataset (AIM-500) affirm that FormerUnify achieves rapid inference speed without compromising its superior accuracy.
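The three decoder stages named in the abstract (unify, fuse, upsample and predict) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the abstract does not specify the layers, so the nearest-neighbour resampling, the element-wise averaging used as fusion, the sigmoid output, and all function names below are assumptions chosen only to show the data flow on single-channel feature maps stored as nested lists.

```python
import math


def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2D list by an integer factor."""
    return [
        [row[x // factor] for x in range(len(row) * factor)]
        for row in fmap
        for _ in range(factor)
    ]


def unify(features):
    """Assumed 'unify' step: bring every map to the largest resolution."""
    target = max(len(f) for f in features)
    return [upsample_nearest(f, target // len(f)) for f in features]


def fuse(unified):
    """Assumed 'fusion' step: element-wise mean of same-sized maps.

    A learned fusion layer would replace this average in a real network.
    """
    n = len(unified)
    h, w = len(unified[0]), len(unified[0][0])
    return [[sum(f[y][x] for f in unified) / n for x in range(w)]
            for y in range(h)]


def predict_alpha(fused, factor):
    """Assumed prediction head: upsample, then squash to (0, 1)."""
    up = upsample_nearest(fused, factor)
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in up]


def decode(features, out_factor):
    """Run the three hypothetical stages in sequence."""
    return predict_alpha(fuse(unify(features)), out_factor)


# Toy multi-scale pyramid: 4x4, 2x2 and 1x1 single-channel maps.
features = [
    [[1.0] * 4 for _ in range(4)],
    [[-1.0] * 2 for _ in range(2)],
    [[0.0]],
]
alpha = decode(features, out_factor=4)  # 16x16 alpha matte
```

Here every input map is first resized to 4x4, the three maps are averaged, and the result is upsampled by a factor of 4 to a 16x16 alpha matte with values in (0, 1).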
Pages: 412-425
Number of pages: 14