Bidirectional Relationship Inferring Network for Referring Image Localization and Segmentation

Cited by: 11
Authors
Feng, Guang [1]
Hu, Zhiwei [1]
Zhang, Lihe [1]
Sun, Jiayu [1]
Lu, Huchuan [1]
Affiliations
[1] Dalian Univ Technol, Sch Informat & Commun Engn, Dalian 116024, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Image segmentation; Location awareness; Visualization; Task analysis; Linguistics; Semantics; Feature extraction; Language-guided visual attention; referring image localization and segmentation; segmentation-guided feature augmentation; vision-guided linguistic attention (VLAM);
DOI
10.1109/TNNLS.2021.3106153
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, referring image localization and segmentation has aroused widespread interest. However, the existing methods lack a clear description of the interdependence between language and vision. To this end, we present a bidirectional relationship inferring network (BRINet) to effectively address the challenging tasks. Specifically, we first employ a vision-guided linguistic attention module to perceive the keywords corresponding to each image region. Then, language-guided visual attention adopts the learned adaptive language to guide the update of the visual features. Together, they form a bidirectional cross-modal attention module (BCAM) to achieve the mutual guidance between language and vision. They can help the network align the cross-modal features better. Based on the vanilla language-guided visual attention, we further design an asymmetric language-guided visual attention, which significantly reduces the computational cost by modeling the relationship between each pixel and each pooled subregion. In addition, a segmentation-guided bottom-up augmentation module (SBAM) is utilized to selectively combine multilevel information flow for object localization. Experiments show that our method outperforms other state-of-the-art methods on three referring image localization datasets and four referring image segmentation datasets.
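The cost reduction the abstract attributes to the asymmetric attention (each pixel related to each pooled subregion, rather than to every other pixel) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the language conditioning is omitted entirely, and the function names, the average pooling, the square `grid` partition, and the scaled dot-product scoring are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def avg_pool_regions(feat, grid):
    """Average-pool an (H, W, C) map into grid*grid subregions -> (grid*grid, C).

    Hypothetical helper: assumes H and W are divisible by `grid`.
    """
    h, w, c = feat.shape
    assert h % grid == 0 and w % grid == 0
    blocks = feat.reshape(grid, h // grid, grid, w // grid, c)
    return blocks.mean(axis=(1, 3)).reshape(grid * grid, c)


def asymmetric_attention(feat, grid):
    """Attend from every pixel to every pooled subregion (illustrative only).

    The score matrix has shape (H*W, grid*grid) instead of the
    (H*W, H*W) matrix a full pixel-to-pixel attention would need.
    """
    h, w, c = feat.shape
    queries = feat.reshape(h * w, c)        # one query per pixel
    keys = avg_pool_regions(feat, grid)     # one key/value per subregion
    scores = queries @ keys.T / np.sqrt(c)  # (H*W, grid*grid)
    weights = softmax(scores, axis=-1)      # each row sums to 1
    out = weights @ keys                    # (H*W, C)
    return out.reshape(h, w, c), weights


if __name__ == "__main__":
    feat = np.random.default_rng(0).normal(size=(8, 8, 4))
    out, weights = asymmetric_attention(feat, grid=2)
    print(out.shape, weights.shape)
```

With an 8 x 8 feature map and `grid=2`, the score matrix holds 64 x 4 entries rather than 64 x 64, which is the kind of saving the abstract refers to.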
Pages: 2246-2258
Number of pages: 13
Related papers
50 in total
  • [1] Bi-directional Relationship Inferring Network for Referring Image Segmentation
    Hu, Zhiwei
    Feng, Guang
    Sun, Jiayu
    Zhang, Lihe
    Lu, Huchuan
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 4423 - 4432
  • [2] Prompt-guided bidirectional deep fusion network for referring image segmentation
    Wu, Junxian
    Zhang, Yujia
    Kampffmeyer, Michael
    Zhao, Xiaoguang
    NEUROCOMPUTING, 2025, 616
  • [3] Structured Attention Network for Referring Image Segmentation
    Lin, Liang
    Yan, Pengxiang
    Xu, Xiaoqian
    Yang, Sibei
    Zeng, Kun
    Li, Guanbin
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 1922 - 1932
  • [4] Bidirectional Mask Selection for Zero-Shot Referring Image Segmentation
    Li, Wenhui
    Pang, Chao
    Nie, Weizhi
    Tian, Hongshuo
    Liu, An-An
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (01) : 911 - 921
  • [5] Structured Multimodal Fusion Network for Referring Image Segmentation
    Xue, Mingcheng
    Liu, Yu
    Xu, Kaiping
    Zhang, Haiyang
    Yu, Chengyang
    PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, ICMI 2022, 2022, : 36 - 47
  • [6] Dual Convolutional LSTM Network for Referring Image Segmentation
    Ye, Linwei
    Liu, Zhi
    Wang, Yang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (12) : 3224 - 3235
  • [7] Query Reconstruction Network for Referring Expression Image Segmentation
    Shi, Hengcan
    Li, Hongliang
    Wu, Qingbo
    Ngan, King Ngi
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 995 - 1007
  • [8] PRNet: A Progressive Refinement Network for referring image segmentation
    Liu, Jing
    Jiang, Huajie
    Hu, Yongli
    Yin, Baocai
    NEUROCOMPUTING, 2025, 630
  • [9] A CONTEXT-BASED NETWORK FOR REFERRING IMAGE SEGMENTATION
    Li, Xinyu
    Liu, Yu
    Xu, Kaiping
    Zhao, Zhehuan
    Liu, Sipei
    2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020, : 1436 - 1440
  • [10] Bilateral Knowledge Interaction Network for Referring Image Segmentation
    Ding, Haixin
    Zhang, Shengchuan
    Wu, Qiong
    Yu, Songlin
    Hu, Jie
    Cao, Liujuan
    Ji, Rongrong
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 2966 - 2977