Learning Semantic Alignment Using Global Features and Multi-Scale Confidence

Cited by: 0
Authors
Xu, Huaiyuan [1]
Liao, Jing [2]
Liu, Huaping [3]
Sun, Yuxiang [1]
Affiliations
[1] Hong Kong Polytech Univ, Dept Mech Engn, Kowloon, Hong Kong, Peoples R China
[2] City Univ Hong Kong, Dept Comp Sci, Kowloon, Hong Kong, Peoples R China
[3] Tsinghua Univ, Inst Artificial Intelligence, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Semantics; Correlation; Feature extraction; Transformers; Training; Task analysis; Probabilistic logic; Semantic alignment; enhancement transformer; probabilistic correlation computation; cross-domain alignment;
DOI
10.1109/TCSVT.2023.3288370
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract
Semantic alignment aims to establish pixel correspondences between images based on semantic consistency. It can serve as a fundamental component of various downstream computer vision tasks, such as style transfer and exemplar-based colorization. Many existing methods use local features and their cosine similarities to infer semantic alignment. However, they struggle with significant intra-class variation in object appearance, size, and other attributes; in other words, contents with the same semantics can look very different. To address this issue, we propose a novel deep neural network built around global feature enhancement and adaptive multi-scale inference. Specifically, two modules are proposed: an enhancement transformer that enhances semantic features with global awareness, and a probabilistic correlation module that adaptively fuses multi-scale information based on learned confidence scores. We use a unified network architecture to achieve two types of semantic alignment, namely cross-object semantic alignment and cross-domain semantic alignment. Experimental results demonstrate that our method achieves competitive performance on five standard cross-object semantic alignment benchmarks and outperforms the state of the art in cross-domain semantic alignment.
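As a rough illustration of two ideas the abstract names (cosine-similarity correlation between local features, and fusing multi-scale correlations according to confidence scores), a minimal pure-Python sketch might look like the following. The function names, the softmax weighting, and the toy inputs are assumptions made for illustration only; this record does not describe the internals of the paper's enhancement transformer or probabilistic correlation module, and the sketch does not reproduce them.

```python
import math


def cosine_correlation(src, tgt):
    """Cosine similarity between every source and every target feature vector.

    src and tgt are lists of equal-length feature vectors (lists of floats).
    Returns a len(src) x len(tgt) matrix as nested lists.
    """
    def norm(v):
        # Fall back to 1.0 for an all-zero vector to avoid division by zero.
        return math.sqrt(sum(x * x for x in v)) or 1.0

    return [
        [sum(a * b for a, b in zip(s, t)) / (norm(s) * norm(t)) for t in tgt]
        for s in src
    ]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_by_confidence(corr_maps, confidences):
    """Weighted sum of same-shaped correlation matrices.

    Assumption: one scalar confidence per scale, turned into weights by a
    softmax. In a learned model these scores would come from the network.
    """
    weights = softmax(confidences)
    rows, cols = len(corr_maps[0]), len(corr_maps[0][0])
    return [
        [sum(w * c[i][j] for w, c in zip(weights, corr_maps)) for j in range(cols)]
        for i in range(rows)
    ]


def best_matches(corr):
    """For each source position, the index of the most similar target position."""
    return [max(range(len(row)), key=row.__getitem__) for row in corr]
```

Under these assumptions, each scale would contribute its own correlation matrix (resampled to a common shape), and `best_matches(fuse_by_confidence(maps, scores))` would give a hard correspondence per source position.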
Pages: 897-910
Page count: 14