Text-Vision Relationship Alignment for Referring Image Segmentation

Cited by: 0

Authors:

Pu, Mingxing [1]

Luo, Bing [1]

Zhang, Chao [2]

Xu, Li [3]

Xu, Fayou [1]

Kong, Mingming [1]

Affiliations:

[1] Xihua Univ, Sch Comp & Software Engn, Chengdu 610039, Peoples R China

[2] Sichuan Police Coll, Key Lab Intelligent Policing, Luzhou 646000, Peoples R China

[3] Xihua Univ, Sch Sci, Chengdu 610039, Peoples R China

Source:

Neural Processing Letters | 2024, Vol. 56, No. 2

Funding:

National Natural Science Foundation of China

Keywords:

Semantic parsing; Text-vision alignment; Referring image segmentation

DOI:

10.1007/s11063-024-11487-2

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Referring image segmentation aims to segment an object in an image based on a referring expression. Its difficulty lies in aligning expression semantics with visual instances. Existing methods based on semantic reasoning are limited by the performance of an external syntax parser and do not explicitly explore the relationships between visual instances. This article proposes an end-to-end method for referring image segmentation that aligns 'linguistic relationships' with 'visual relationships' and does not rely on an external syntax parser for expression parsing. The Semantic Component Parser (SCP) adaptively and structurally parses the expression into three components, 'subject', 'object', and 'linguistic relationship', in a learnable manner. The Instances Activation Map Module (IAM) locates multiple visual instances based on the subject and object. In addition, the Relationship Based Visual Localization Module (RBVL) first enables each instance in the image to learn global knowledge, then decodes the visual relationships between these visual instances, and finally aligns the visual relationships with the linguistic relationships to locate the target object more accurately. Experimental results show that the proposed method improves performance by 4-9% over the baseline method on multiple referring image segmentation datasets.
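The alignment idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the attention pooling that stands in for a learnable parser, the use of feature differences as pairwise "visual relationship" features, the cosine-similarity scoring, and all function names (`parse_component`, `align_relationship`) are assumptions made purely for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na > 0 and nb > 0 else 0.0


def parse_component(word_vecs, query):
    """Pool word vectors with attention weights from one query vector.

    Hypothetical stand-in for a learnable parser: one query per component
    (subject, object, relationship) would be learned in a real model.
    """
    weights = softmax([dot(w, query) for w in word_vecs])
    dim = len(word_vecs[0])
    return [sum(wt * w[d] for wt, w in zip(weights, word_vecs)) for d in range(dim)]


def align_relationship(instance_vecs, relation_vec):
    """Return the ordered instance pair whose pair feature best matches relation_vec.

    Assumption: the pair feature is the difference of the two instance vectors.
    """
    best_pair, best_score = None, -2.0  # cosine is always >= -1
    for i, a in enumerate(instance_vecs):
        for j, b in enumerate(instance_vecs):
            if i == j:
                continue
            pair_feat = [x - y for x, y in zip(a, b)]
            score = cosine(pair_feat, relation_vec)
            if score > best_score:
                best_pair, best_score = (i, j), score
    return best_pair, best_score


# Toy data: three 2-D word vectors and three 2-D instance vectors.
words = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
relation = parse_component(words, query=[0.0, 4.0])
instances = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
pair, score = align_relationship(instances, relation)
```

In this toy setup, `pair` holds the indices of the two instances whose difference vector points most nearly in the direction of the pooled relationship vector. A trained model would learn both the queries and the pair features rather than fix them by hand.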


Pages: 21

