Bidirectional Relationship Inferring Network for Referring Image Localization and Segmentation

Cited by: 11
Authors
Feng, Guang [1]
Hu, Zhiwei [1]
Zhang, Lihe [1]
Sun, Jiayu [1]
Lu, Huchuan [1]
Affiliations
[1] Dalian Univ Technol, Sch Informat & Commun Engn, Dalian 116024, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Image segmentation; Location awareness; Visualization; Task analysis; Linguistics; Semantics; Feature extraction; Language-guided visual attention; referring image localization and segmentation; segmentation-guided feature augmentation; vision-guided linguistic attention (VLAM);
DOI
10.1109/TNNLS.2021.3106153
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Recently, referring image localization and segmentation has aroused widespread interest. However, the existing methods lack a clear description of the interdependence between language and vision. To this end, we present a bidirectional relationship inferring network (BRINet) to effectively address the challenging tasks. Specifically, we first employ a vision-guided linguistic attention module to perceive the keywords corresponding to each image region. Then, language-guided visual attention adopts the learned adaptive language to guide the update of the visual features. Together, they form a bidirectional cross-modal attention module (BCAM) to achieve the mutual guidance between language and vision. They can help the network align the cross-modal features better. Based on the vanilla language-guided visual attention, we further design an asymmetric language-guided visual attention, which significantly reduces the computational cost by modeling the relationship between each pixel and each pooled subregion. In addition, a segmentation-guided bottom-up augmentation module (SBAM) is utilized to selectively combine multilevel information flow for object localization. Experiments show that our method outperforms other state-of-the-art methods on three referring image localization datasets and four referring image segmentation datasets.
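The asymmetric attention idea mentioned in the abstract, relating each pixel to a small set of pooled subregions instead of to every other pixel, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the uniform average-pooling grid, the scaled dot-product scoring, and the residual update are all assumptions made for illustration, and the input feature map is assumed to be already fused with language features. Learned projections are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def pool_subregions(feat, grid):
    # Hypothetical helper: average-pool an (H, W, C) map into
    # grid * grid subregion descriptors of shape (grid * grid, C).
    # Assumes grid <= H and grid <= W so that no subregion is empty.
    h, w, _ = feat.shape
    rows = np.array_split(np.arange(h), grid)
    cols = np.array_split(np.arange(w), grid)
    return np.stack(
        [feat[np.ix_(r, q)].mean(axis=(0, 1)) for r in rows for q in cols]
    )

def asymmetric_attention(feat, grid=4):
    # Queries: every pixel. Keys/values: pooled subregions only.
    # The attention map is (H*W, grid*grid) instead of (H*W, H*W).
    h, w, c = feat.shape
    queries = feat.reshape(h * w, c)
    keys = pool_subregions(feat, grid)
    weights = softmax(queries @ keys.T / np.sqrt(c), axis=-1)
    updated = queries + weights @ keys  # residual update (an assumption)
    return updated.reshape(h, w, c), weights

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 8, 16))
out, weights = asymmetric_attention(feat, grid=2)
print(out.shape, weights.shape)
```

With an 8 x 8 map and a 2 x 2 grid, each of the 64 pixels attends to 4 subregions, so the attention map has 64 x 4 entries rather than 64 x 64. That reduction in pairwise comparisons is the kind of saving the abstract attributes to the asymmetric variant.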
Pages: 2246 - 2258
Number of pages: 13
Related papers
50 in total
  • [31] RRSIS: Referring Remote Sensing Image Segmentation
    Yuan, Zhenghang
    Mou, Lichao
    Hua, Yuansheng
    Zhu, Xiao Xiang
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 12
  • [32] Referring Image Segmentation Using Text Supervision
    Liu, Fang
    Liu, Yuhao
    Kong, Yuqiu
    Xu, Ke
    Zhang, Lihe
    Yin, Baocai
    Hancke, Gerhard
    Lau, Rynson
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 22067 - 22077
  • [33] Distillation and Supplementation of Features for Referring Image Segmentation
    Tan, Zeyu
    Xu, Dahong
    Li, Xi
    Liu, Hong
    IEEE ACCESS, 2024, 12 : 171269 - 171279
  • [34] Image Segmentation With Language Referring Expression and Comprehension
    Sun, Jiaxing
    Li, Yujie
    Cai, Jintong
    Lu, Huimin
    Serikawa, Seiichi
    IEEE SENSORS JOURNAL, 2022, 22 (18) : 17406 - 17413
  • [35] Recurrent Multimodal Interaction for Referring Image Segmentation
    Liu, Chenxi
    Lin, Zhe
    Shen, Xiaohui
    Yang, Jimei
    Lu, Xin
    Yuille, Alan
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 1280 - 1289
  • [36] Referring Image Segmentation by Generative Adversarial Learning
    Qiu, Shuang
    Zhao, Yao
    Jiao, Jianbo
    Wei, Yunchao
    Wei, Shikui
    IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (05) : 1333 - 1344
  • [37] Contrastive Grouping with Transformer for Referring Image Segmentation
    Tang, Jiajin
    Zheng, Ge
    Shi, Cheng
    Yang, Sibei
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 23570 - 23580
  • [38] ReMamber: Referring Image Segmentation with Mamba Twister
    Yang, Yuhuan
    Ma, Chaofan
    Yao, Jiangchao
    Zhong, Zhun
    Zhang, Ya
    Wang, Yanfeng
    COMPUTER VISION - ECCV 2024, PT X, 2025, 15068 : 108 - 126
  • [39] Referring Image Segmentation Without Text Annotations
    Liu, Jing
    Jiang, Huajie
    Bi, Yandong
    Hu, Yongli
    Yin, Baocai
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT XII, ICIC 2024, 2024, 14873 : 278 - 293
  • [40] REFERRING IMAGE SEGMENTATION FOR REMOTE SENSING DATA
    Yuan, Zhenghang
    Mou, Lichao
    Hua, Yuansheng
    Zhu, Xiao Xiang
    IGARSS 2024-2024 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, IGARSS 2024, 2024, : 946 - 949