Semantic-Aware Visual Decomposition for Image Coding

Cited by: 3
Authors
Chang, Jianhui [1]
Zhang, Jian [2]
Li, Jiguo [3]
Wang, Shiqi [4]
Mao, Qi [5]
Jia, Chuanmin [1]
Ma, Siwei [1]
Gao, Wen [1]
Affiliations
[1] Peking Univ, Natl Engn Res Ctr Visual Technol, Sch Comp Sci, Beijing 100871, Peoples R China
[2] Peking Univ, Sch Elect & Comp Engn, Shenzhen Grad Sch, Shenzhen 518055, Peoples R China
[3] Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
[4] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
[5] Commun Univ China, State Key Lab Media Convergence & Commun, Beijing 100024, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image coding; Semantic-aware visual decomposition; Structure-texture; Coherency regularization; Extremely low bitrate;
DOI
10.1007/s11263-023-01809-7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In this paper, we propose a novel image coding framework with semantic-aware visual decomposition for extremely low bitrate compression. At the encoder side, an input image is decomposed into a semantic map, which serves as the structural representation, and semantic-wise texture representations; both are then compressed into bitstreams. At the decoder side, the received bitstreams of the dual-layer representations are decoded, and the target image is synthesized from them with generative models. Moreover, an attention mechanism is introduced into the model architecture for texture representation modeling, and a coherency regularization is proposed to further optimize the texture representation space by aligning it with the source pixel space for higher synthesis quality. We also propose a cross-channel entropy module and control the quantization scale to facilitate rate-distortion optimization. Once the decomposed components are compressed into the bitstream, this simple yet effective representation philosophy benefits image compression in several respects. First, in terms of compression performance, compact representations and high visual synthesis quality bring remarkable advantages. Second, the proposed framework yields a physically explainable bitstream composed of the structural segment and semantic-wise texture segments. Third and most importantly, subsequent vision tasks (e.g., content manipulation) can receive fundamental support from the semantic-aware visual decomposition and synthesis mechanism. Extensive experimental results demonstrate the superiority of the proposed framework in efficient visual representation learning, high-efficiency image compression (< 0.1 bpp), and intelligent visual applications (e.g., manipulation and analysis).
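The dual-layer idea in the abstract can be illustrated with a toy sketch. This is not the authors' model: the learned texture encoder, attention, entropy module, and generative decoder are all replaced here by assumptions chosen for brevity. The texture of each semantic region is stood in for by its mean colour, quantization is plain uniform rounding with a hypothetical `scale` parameter, and all function names are invented for illustration.

```python
import numpy as np

def decompose(image, semantic_map, num_classes):
    """Return one texture vector per semantic class.

    Assumption: the mean colour of a region stands in for a learned
    semantic-wise texture representation.
    """
    textures = np.zeros((num_classes, image.shape[-1]), dtype=np.float64)
    for c in range(num_classes):
        mask = semantic_map == c          # pixels belonging to class c
        if mask.any():
            textures[c] = image[mask].mean(axis=0)
    return textures

def quantize(textures, scale):
    """Uniform scalar quantization; a larger scale gives coarser symbols."""
    return np.round(textures / scale).astype(np.int64)

def synthesize(semantic_map, symbols, scale):
    """Paint every region with its dequantized texture vector.

    Assumption: a lookup replaces the generative synthesis network.
    """
    return (symbols * scale)[semantic_map]

def bits_per_pixel(total_bits, height, width):
    """Bitrate in bits per pixel (bpp) for a bitstream of total_bits."""
    return total_bits / (height * width)
```

A short usage example under the same assumptions: a 4 x 4 image with two semantic regions is decomposed, quantized with `scale = 8`, and reconstructed from the semantic map plus two texture vectors.

```python
image = np.zeros((4, 4, 3))
image[:, :2] = 10.0                       # left region, class 0
image[:, 2:] = 200.0                      # right region, class 1
semantic_map = np.zeros((4, 4), dtype=np.int64)
semantic_map[:, 2:] = 1

symbols = quantize(decompose(image, semantic_map, num_classes=2), scale=8)
reconstruction = synthesize(semantic_map, symbols, scale=8)

# A hypothetical 6000-bit bitstream for a 256 x 256 image stays under 0.1 bpp.
rate = bits_per_pixel(6000, 256, 256)
```

Raising `scale` shrinks the range of symbols that must be coded at the cost of a coarser reconstruction, which is the rate-distortion trade-off the abstract refers to when it mentions controlling the quantization scale.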
Pages: 2333-2355
Page count: 23