Bidirectional Relationship Inferring Network for Referring Image Localization and Segmentation

Cited by: 11
Authors
Feng, Guang [1]
Hu, Zhiwei [1]
Zhang, Lihe [1]
Sun, Jiayu [1]
Lu, Huchuan [1]
Affiliations
[1] Dalian Univ Technol, Sch Informat & Commun Engn, Dalian 116024, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image segmentation; Location awareness; Visualization; Task analysis; Linguistics; Semantics; Feature extraction; Language-guided visual attention; referring image localization and segmentation; segmentation-guided feature augmentation; vision-guided linguistic attention (VLAM);
DOI
10.1109/TNNLS.2021.3106153
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, referring image localization and segmentation has attracted widespread interest. However, existing methods lack an explicit model of the interdependence between language and vision. To this end, we present a bidirectional relationship inferring network (BRINet) to address these challenging tasks. Specifically, we first employ a vision-guided linguistic attention module to perceive the keywords corresponding to each image region. Then, language-guided visual attention uses the learned adaptive language representation to guide the update of the visual features. Together, they form a bidirectional cross-modal attention module (BCAM) that achieves mutual guidance between language and vision and helps the network better align the cross-modal features. Building on the vanilla language-guided visual attention, we further design an asymmetric language-guided visual attention, which significantly reduces the computational cost by modeling the relationship between each pixel and each pooled subregion rather than between every pair of pixels. In addition, a segmentation-guided bottom-up augmentation module (SBAM) is utilized to selectively combine multilevel information flow for object localization. Experiments show that our method outperforms other state-of-the-art methods on three referring image localization datasets and four referring image segmentation datasets.
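The cost saving claimed for the asymmetric attention can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the pooling scheme, the projections, or how the language features enter, so the average pooling over a regular grid, the scaled dot-product scoring, and the function names `avg_pool_regions` and `asymmetric_attention` are all assumptions made for illustration, and the language guidance is omitted entirely. The point it shows is only that each pixel attends to a small set of pooled subregions, so the attention matrix has shape (H*W, grid*grid) instead of (H*W, H*W).

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def avg_pool_regions(feat, grid):
    """Average-pool an (H, W, C) map into grid*grid subregion vectors.

    Hypothetical helper: assumes H and W are divisible by `grid`.
    """
    h, w, c = feat.shape
    assert h % grid == 0 and w % grid == 0
    pooled = feat.reshape(grid, h // grid, grid, w // grid, c).mean(axis=(1, 3))
    return pooled.reshape(grid * grid, c)

def asymmetric_attention(feat, grid):
    """Each pixel (query) attends to pooled subregions (keys/values).

    Assumed form: scaled dot-product attention without learned projections.
    Returns the updated (H, W, C) map and the (H*W, grid*grid) weights.
    """
    h, w, c = feat.shape
    queries = feat.reshape(h * w, c)       # one query per pixel
    keys = avg_pool_regions(feat, grid)    # one key per pooled subregion
    weights = softmax(queries @ keys.T / np.sqrt(c), axis=-1)
    out = weights @ keys                   # weighted sum of subregion vectors
    return out.reshape(h, w, c), weights

# Usage: an 8x8 map with 4 channels and a 2x2 pooling grid.
feat = np.random.default_rng(0).normal(size=(8, 8, 4))
out, weights = asymmetric_attention(feat, grid=2)
```

With this 8x8 map, the weight matrix has 64 x 4 entries, whereas pixel-to-pixel attention would need 64 x 64; the gap widens as the spatial resolution grows while the grid stays fixed.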
Pages: 2246 - 2258
Number of pages: 13
Related papers
50 items in total
  • [41] Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation
    Liu, Sihan
    Ma, Yiwei
    Zhang, Xiaoqing
    Wang, Haowei
    Ji, Jiayi
    Sun, Xiaoshuai
    Ji, Rongrong
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 26648 - 26658
  • [42] Mixed-scale cross-modal fusion network for referring image segmentation
    Pan, Xiong
    Xie, Xuemei
    Yang, Jianxiu
    NEUROCOMPUTING, 2025, 614
  • [43] Referring Image Segmentation via Joint Mask Contextual Embedding Learning and Progressive Alignment Network
    Huang, Ziling
    Satoh, Shin'ichi
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 7753 - 7762
  • [44] Cross-modal fusion encoder via graph neural network for referring image segmentation
    Zhang, Yuqing
    Zhang, Yong
    Piao, Xinglin
    Yuan, Peng
    Hu, Yongli
    Yin, Baocai
    IET IMAGE PROCESSING, 2024, 18 (04) : 1083 - 1095
  • [45] PolyFormer: Referring Image Segmentation as Sequential Polygon Generation
    Liu, Jiang
    Ding, Hui
    Cai, Zhaowei
    Zhang, Yuting
    Satzoda, Ravi Kumar
    Mahadevan, Vijay
    Manmatha, R.
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 18653 - 18663
  • [46] CRIS: CLIP-Driven Referring Image Segmentation
    Wang, Zhaoqing
    Lu, Yu
    Li, Qiang
    Tao, Xunqiang
    Guo, Yandong
    Gong, Mingming
    Liu, Tongliang
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 11676 - 11685
  • [47] Attentive Excitation and Aggregation for Bilingual Referring Image Segmentation
    Zhou, Qianli
    Hui, Tianrui
    Wang, Rong
    Hu, Haimiao
    Liu, Si
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2021, 12 (02)
  • [48] A survey of methods for addressing the challenges of referring image segmentation
    Ji, Lixia
    Du, Yunlong
    Dang, Yiping
    Gao, Wenzhao
    Zhang, Han
    NEUROCOMPUTING, 2024, 583
  • [49] Locate then Segment: A Strong Pipeline for Referring Image Segmentation
    Jing, Ya
    Kong, Tao
    Wang, Wei
    Wang, Liang
    Li, Lei
    Tan, Tieniu
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 9853 - 9862
  • [50] Learning From Box Annotations for Referring Image Segmentation
    Feng, Guang
    Zhang, Lihe
    Hu, Zhiwei
    Lu, Huchuan
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (03) : 3927 - 3937