Discrete codebook collaborating with transformer for thangka image inpainting

Cited by: 0
Authors
Bai, Jinxian [1 ]
Fan, Yao [1 ]
Zhao, Zhiwei [1 ]
Affiliations
[1] Xizang Minzu Univ, Sch Informat Engn, Xianyang 712000, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image inpainting; Thangka images; Transformer; Cross-shaped window attention; Codebook;
D O I
10.1007/s00530-024-01439-0
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Thangka, as a precious heritage of painting art, holds irreplaceable research value due to its richness in Tibetan history, religious beliefs, and folk culture. However, it is susceptible to partial damage and form distortion due to natural erosion or inadequate conservation measures. Given the complexity of textures and rich semantics in thangka images, existing image inpainting methods struggle to recover their original artistic style and intricate details. In this paper, we propose a novel approach combining discrete codebook learning with a transformer for image inpainting, tailored specifically for thangka images. In the codebook learning stage, we design an improved network framework based on vector quantization (VQ) codebooks to discretely encode intermediate features of input images, yielding a context-rich discrete codebook. The second phase introduces a parallel transformer module based on a cross-shaped window, which efficiently predicts the index combinations for missing regions at limited computational cost. Furthermore, we devise a multi-scale feature guidance module that progressively fuses features from intact areas with textural features from the codebook, thereby enhancing the preservation of local details in non-damaged regions. We validate the efficacy of our method through qualitative and quantitative experiments on datasets including CelebA-HQ, Places2, and a custom thangka dataset. Experimental results demonstrate that compared to previous methods, our approach successfully reconstructs images with more complete structural information and clearer textural details.
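The codebook stage described in the abstract follows the standard vector-quantization idea: each continuous encoder feature is replaced by the index of its nearest codebook entry, so that the transformer can later treat missing regions as index combinations to predict. The paper's actual network is not reproduced here; a minimal NumPy sketch of just the lookup step, using a toy random codebook and a hypothetical `quantize` helper, might look like:

```python
import numpy as np

def quantize(features, codebook):
    """Map each continuous feature vector to its nearest codebook entry.

    features: (N, D) array of encoder outputs, one row per spatial location.
    codebook: (K, D) array of learned discrete embeddings.
    Returns the chosen indices (N,) and the quantized vectors (N, D).
    """
    # Squared Euclidean distance between every feature and every code,
    # expanded as ||f||^2 - 2 f.c + ||c||^2 to avoid an explicit loop.
    d = (
        (features ** 2).sum(axis=1, keepdims=True)
        - 2.0 * features @ codebook.T
        + (codebook ** 2).sum(axis=1)
    )
    indices = d.argmin(axis=1)          # one discrete token id per location
    return indices, codebook[indices]   # quantized (discretized) features

# Toy usage: 4 feature vectors of dimension 8, codebook of 3 entries.
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 8))
codes = rng.normal(size=(3, 8))
ids, quantized = quantize(feats, codes)
```

In a trained VQ model the codebook entries are learned jointly with the encoder and decoder; here they are random purely to illustrate the lookup.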
Pages: 17
Related papers
50 records in total
  • [11] Continuously Masked Transformer for Image Inpainting
    Ko, Keunsoo
    Kim, Chang-Su
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 13123 - 13132
  • [12] DLFormer: Discrete Latent Transformer for Video Inpainting
    School of Computer Science and Engineering, South China University of Technology, China
    Proc IEEE Comput Soc Conf Comput Vision Pattern Recognit, : 3501 - 3510
  • [13] No-Reference Quality Assessment Method for Inpainting Thangka Image Based on Multiple Features
    Ye Yuqi
    Hu Wenjin
    LASER & OPTOELECTRONICS PROGRESS, 2020, 57 (08)
  • [14] Damaged region filling and evaluation by symmetrical exemplar-based image inpainting for Thangka
    Wang, Weilan
    Jia, Yanjun
    EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING, 2017
  • [16] Bidirectional interaction of CNN and Transformer for image inpainting
    Liu, Jialu
    Gong, Maoguo
    Gao, Yuan
    Lu, Yiheng
    Li, Hao
    KNOWLEDGE-BASED SYSTEMS, 2024, 299
  • [17] A transformer–CNN for deep image inpainting forensics
    Zhu, Xinshan
    Lu, Junyan
    Ren, Honghao
    Wang, Hongquan
    Sun, Biao
    The Visual Computer, 2023, 39 : 4721 - 4735
  • [18] IMAGE INPAINTING BY MSCSWIN TRANSFORMER ADVERSARIAL AUTOENCODER
    Chen, Bo-Wei
    Liu, Tsung-Jung
    Liu, Kuan-Hsien
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 2040 - 2044
  • [19] TSFormer: Tracking Structure Transformer for Image Inpainting
    Lin, Jiayu
    Wang, Yuan-gen
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (12)
  • [20] Edge-Guided Image Inpainting with Transformer
    Liang, Huining
    Kambhamettu, Chandra
    ADVANCES IN VISUAL COMPUTING, ISVC 2023, PT II, 2023, 14362 : 285 - 296