Open-Vocabulary And Multitask Image Segmentation

Times cited: 0
Authors
Pan, Lihu [1]
Yang, Yunting [1]
Wang, Zhengkui [2]
Shan, Wen [3]
Yin, Jaili [1]
Affiliations
[1] Taiyuan Univ Sci & Technol, Taiyuan, Peoples R China
[2] Singapore Inst Technol, Infocomm Technol Cluster, Singapore, Singapore
[3] Singapore Univ Social Sci, Singapore, Singapore
Keywords
Image segmentation; Adaptive prompt learning; Image-text fusion; Multitask;
DOI
10.1145/3605098.3636192
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification codes
081203; 0835;
Abstract
Open-vocabulary learning has revolutionized image segmentation, enabling the delineation of arbitrary categories from textual descriptions. While current methods often employ specialized architectures, OVAMTSeg presents a unified framework for Open-Vocabulary and Multitask Image Segmentation. Leveraging adaptive prompt learning, OVAMTSeg excels in capturing category-sensitive concepts, ensuring robustness across diverse multi-task scenarios. Text prompts effectively capture semantic and contextual features, while cross-attention and cross-modal interactions facilitate seamless fusion of image and text features. The framework incorporates a transformer-based decoder for dense prediction. Experimental results demonstrate OVAMTSeg's effectiveness, achieving 47.5 mIoU in referring expression segmentation, 51.6 mIoU on Pascal-VOC with four unseen classes, 46.6 mIoU on Pascal-Context in zero-shot segmentation, and 65.9 mIoU on Pascal-5i and 35.7 mIoU on COCO-20i in one-shot segmentation.
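The results in the abstract are all reported as mIoU (mean intersection-over-union). As a reading aid, here is a minimal sketch of the standard metric on a toy example. It is not the authors' evaluation code: the function name `mean_iou`, the flattened-mask input, and the choice to skip classes absent from both masks are assumptions made for illustration. Benchmark protocols usually accumulate intersections and unions over a whole dataset (and, for few-shot benchmarks, over folds) rather than per image.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    pred and target are equal-length sequences of integer class ids
    (a segmentation mask flattened to one dimension).
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both masks agree on class c
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either mask is labelled class c
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumption: ignore classes absent from both masks
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x3 masks, flattened; 0 = background, 1 = object
pred = [0, 1, 1, 0, 1, 0]
target = [0, 1, 0, 0, 1, 1]
score = mean_iou(pred, target, num_classes=2)
print(f"mIoU = {100 * score:.1f}")  # papers typically report mIoU as a percentage
```

In this toy case each class has an intersection of 2 pixels and a union of 4, so both per-class IoUs are 0.5.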
Pages: 1048-1049
Number of pages: 2