Advancing Real-World Stereoscopic Image Super-Resolution via Vision-Language Model

Cited by: 0
Authors
Zhang, Zhe [1, 2]
Lei, Jianjun [1 ]
Peng, Bo [1 ]
Zhu, Jie [1 ]
Xu, Liying [1 ]
Huang, Qingming [3 ]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[2] Tianjin Univ Commerce, Sch Informat Engn, Tianjin 300134, Peoples R China
[3] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Stereo image processing; Degradation; Superresolution; Visualization; Image reconstruction; Training; Iterative methods; Solid modeling; Computational modeling; Cognition; Super-resolution; stereoscopic image; vision-language model
DOI
10.1109/TIP.2025.3546470
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent years have witnessed the remarkable success of the vision-language model in various computer vision tasks. However, how to exploit the semantic language knowledge of the vision-language model to advance real-world stereoscopic image super-resolution remains a challenging problem. This paper proposes a vision-language model-based stereoscopic image super-resolution (VLM-SSR) method, in which the semantic language knowledge in CLIP is exploited to facilitate stereoscopic image SR in a training-free manner. Specifically, by designing visual prompts for CLIP to infer the region similarity, a prompt-guided information aggregation mechanism is presented to capture inter-view information among relevant regions between the left and right views. Besides, driven by the prior knowledge of CLIP, a cognition prior-driven iterative enhancing mechanism is presented to optimize fuzzy regions adaptively. Experimental results on four datasets verify the effectiveness of the proposed method.
Pages: 2187-2197
Page count: 11
Related papers
50 in total
  • [41] Investigating Tradeoffs in Real-World Video Super-Resolution
    Chan, Kelvin C. K.
    Zhou, Shangchen
    Xu, Xiangyu
    Loy, Chen Change
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 5952 - 5961
  • [42] Learning Degradation for Real-World Face Super-Resolution
    Chen, Jin
    Chen, Jun
    Wang, Xiaofen
    Xu, Dongshu
    Liang, Chao
    Han, Zhen
    ADVANCES IN COMPUTER GRAPHICS, CGI 2023, PT II, 2024, 14496 : 120 - 131
  • [43] Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model
    Cheng, Kanzhi
    Song, Wenpo
    Ma, Zheng
    Zhu, Wenhao
    Zhu, Zixuan
    Zhang, Jianbing
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 5038 - 5047
  • [44] Real-world super-resolution based on iterative frequency domain degradation model
    Hao, Yukun
    Liu, Yuchen
    SIGNAL IMAGE AND VIDEO PROCESSING, 2025, 19 (01)
  • [45] Dual Adversarial Adaptation for Cross-Device Real-World Image Super-Resolution
    Xu, Xiaoqian
    Wei, Pengxu
    Chen, Weikai
    Liu, Yang
    Mao, Mingzhi
    Lin, Liang
    Li, Guanbin
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 5657 - 5666
  • [46] Frequency-Aware Degradation Modeling for Real-World Thermal Image Super-Resolution
    Qu, Chao
    Chen, Xiaoyu
    Xu, Qihan
    Han, Jing
    ENTROPY, 2024, 26 (03)
  • [47] Unsupervised Degradation Aware and Representation for Real-World Remote Sensing Image Super-Resolution
    Guo, Wen-Zhong
    Weng, Wu-Ding
    Chen, Guang-Yong
    Su, Jian-Nan
    Gan, Min
    Philip Chen, C. L.
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 1
  • [48] DiffSteISR: Harnessing diffusion prior for superior real-world stereo image super-resolution
    Zhou, Yuanbo
    Zhang, Xinlin
    Deng, Wei
    Wang, Tao
    Tan, Tao
    Gao, Qinquan
    Tong, Tong
    NEUROCOMPUTING, 2025, 623
  • [49] Exploiting Degradation Prior for Personalized Federated Learning in Real-World Image Super-Resolution
    Yang, Yue
    Ke, Liangjun
    PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024, : 146 - 154
  • [50] A Real-World Benchmark for Sentinel-2 Multi-Image Super-Resolution
    Kowaleczko, Pawel
    Tarasiewicz, Tomasz
    Ziaja, Maciej
    Kostrzewa, Daniel
    Nalepa, Jakub
    Rokita, Przemyslaw
    Kawulok, Michal
    SCIENTIFIC DATA, 10