Bilateral Knowledge Interaction Network for Referring Image Segmentation

Cited by: 7
Authors
Ding, Haixin [1]
Zhang, Shengchuan [1]
Wu, Qiong [1]
Yu, Songlin [1]
Hu, Jie [1]
Cao, Liujuan [1]
Ji, Rongrong [1]
Affiliations
[1] Xiamen Univ, Key Lab Multimedia Trusted Percept & Efficient Com, Minist Educ China, Xiamen 361005, Peoples R China
Keywords
Image segmentation; Visualization; Kernel; Knowledge engineering; Feature extraction; Semantics; Convolution; Referring image segmentation; vision-language; AGGREGATION;
DOI
10.1109/TMM.2023.3305869
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Referring image segmentation aims to segment objects that are described by natural language expressions. Although remarkable advancements have been made to align natural language expressions with visual representations for better performance, the interaction between image-level and text-level information is still not formulated properly. Most previous works focus on building correlations between vision and language, ignoring the variety of objects. As a result, target objects with unique appearances may not be correctly located or completely segmented. In this article, we propose a novel Bilateral Knowledge Interaction Network, termed BKINet, which reformulates the image-text interaction in a bilateral manner to adapt concrete knowledge of the target object in the image. BKINet contains two key components: a knowledge learning module (KLM) and a knowledge applying module (KAM). In the KLM, the abstract knowledge from text features is replenished with concrete knowledge from visual features to adapt to the target objects in the input images, which generates the knowledge interaction kernels (KI kernels) containing abundant referring information. With the referring information of KI kernels, the KAM is designed to highlight the most relevant visual features for predicting the accurate segmentation mask. Extensive experiments on three widely used datasets, i.e., RefCOCO, RefCOCO+, and G-ref, demonstrate the superiority of BKINet over the state-of-the-art.
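The two-stage idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the dot-product attention, the residual addition, and the sigmoid readout are all assumptions chosen only to show how text features might be replenished with visual features to form kernels (a KLM-like step) and how those kernels might then score pixels (a KAM-like step).

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of floats.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def learn_kernels(text_feats, visual_feats):
    # KLM-like step (assumed form): each text feature attends over all
    # pixels and is replenished with the attention-weighted visual
    # feature through a residual addition.
    kernels = []
    for t in text_feats:
        weights = softmax([dot(t, v) for v in visual_feats])
        context = [
            sum(w * v[d] for w, v in zip(weights, visual_feats))
            for d in range(len(t))
        ]
        kernels.append([ti + ci for ti, ci in zip(t, context)])
    return kernels

def apply_kernels(kernels, visual_feats):
    # KAM-like step (assumed form): score each pixel by its mean
    # response to the kernels, then squash to (0, 1) with a sigmoid.
    mask = []
    for v in visual_feats:
        score = sum(dot(k, v) for k in kernels) / len(kernels)
        mask.append(1.0 / (1.0 + math.exp(-score)))
    return mask

# Toy input: 4 "pixels" with 2-d features and a single text token.
visual = [[2.0, 0.0], [1.8, 0.2], [-1.0, 1.0], [-1.5, -0.5]]
text = [[1.0, 0.0]]

kernels = learn_kernels(text, visual)
mask = apply_kernels(kernels, visual)
```

In this toy setup the first two pixels align with the text feature, so they receive mask values above 0.5 while the last two fall below it. A real model would use learned projections, multi-scale features, and a trained decoder in place of these fixed operations.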
Pages: 2966 - 2977
Page count: 12
Related papers
50 items in total
  • [31] Referring Image Segmentation Using Text Supervision
    Liu, Fang
    Liu, Yuhao
    Kong, Yuqiu
    Xu, Ke
    Zhang, Lihe
    Yin, Baocai
    Hancke, Gerhard
    Lau, Rynson
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 22067 - 22077
  • [32] Distillation and Supplementation of Features for Referring Image Segmentation
    Tan, Zeyu
    Xu, Dahong
    Li, Xi
    Liu, Hong
    IEEE ACCESS, 2024, 12 : 171269 - 171279
  • [33] Image Segmentation With Language Referring Expression and Comprehension
    Sun, Jiaxing
    Li, Yujie
    Cai, Jintong
    Lu, Huimin
    Serikawa, Seiichi
    IEEE SENSORS JOURNAL, 2022, 22 (18) : 17406 - 17413
  • [34] Referring Image Segmentation by Generative Adversarial Learning
    Qiu, Shuang
    Zhao, Yao
    Jiao, Jianbo
    Wei, Yunchao
    Wei, Shikui
    IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (05) : 1333 - 1344
  • [35] Contrastive Grouping with Transformer for Referring Image Segmentation
    Tang, Jiajin
    Zheng, Ge
    Shi, Cheng
    Yang, Sibei
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 23570 - 23580
  • [36] ReMamber: Referring Image Segmentation with Mamba Twister
    Yang, Yuhuan
    Ma, Chaofan
    Yao, Jiangchao
    Zhong, Zhun
    Zhang, Ya
    Wang, Yanfeng
    COMPUTER VISION - ECCV 2024, PT X, 2025, 15068 : 108 - 126
  • [37] Referring Image Segmentation Without Text Annotations
    Liu, Jing
    Jiang, Huajie
    Bi, Yandong
    Hu, Yongli
    Yin, Baocai
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT XII, ICIC 2024, 2024, 14873 : 278 - 293
  • [38] REFERRING IMAGE SEGMENTATION FOR REMOTE SENSING DATA
    Yuan, Zhenghang
    Mou, Lichao
    Hua, Yuansheng
    Zhu, Xiao Xiang
    IGARSS 2024-2024 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, IGARSS 2024, 2024, : 946 - 949
  • [39] Mixed-scale cross-modal fusion network for referring image segmentation
    Pan, Xiong
    Xie, Xuemei
    Yang, Jianxiu
    NEUROCOMPUTING, 2025, 614
  • [40] FLPK-BiSeNet: Federated Learning Based on Priori Knowledge and Bilateral Segmentation Network for Image Edge Extraction
    Teng, Lin
    Qiao, Yulong
    Shafiq, Muhammad
    Srivastava, Gautam
    Javed, Abdul Rehman
    Gadekallu, Thippa Reddy
    Yin, Shoulin
    IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT, 2023, 20 (02) : 1529 - 1542