Coarse Mask Guided Interactive Object Segmentation

Cited by: 3
Authors
Li, Jing [1, 2]
Fan, Junsong [3, 4]
Wang, Yuxi [3, 4]
Yang, Yuran [5]
Zhang, Zhaoxiang [4, 6, 7, 8]
Affiliations
[1] Chinese Acad Sci CASIA, Inst Automat, Ctr Res Intelligent Percept & Comp CRIPAC, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci UCAS, Sch Artificial Intelligence, Beijing 100190, Peoples R China
[3] Chinese Acad Sci CASIA, Inst Automat, Ctr Res Intelligent Percept & Comp CRIPAC, Beijing 100190, Peoples R China
[4] HKISI CAS, Ctr Artificial Intelligence & Robot, Hong Kong, Peoples R China
[5] Tencent Maps, Beijing 100101, Peoples R China
[6] Chinese Acad Sci CASIA, Inst Automat, Beijing 100190, Peoples R China
[7] Univ Chinese Acad Sci UCAS, Sch Future Technol, Beijing 100049, Peoples R China
[8] State Key Lab Multimodal Artificial Intelligence S, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Segmentation; interactive; transformer; annotation tool; RANDOM-WALKS; IMAGE; CUT;
DOI
10.1109/TIP.2023.3322564
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405;
Abstract
Interactive object segmentation aims to produce object masks from user interactions such as clicks, bounding boxes, and scribbles. Clicks are the most popular interactive cue because of their efficiency, and click-based deep learning methods have attracted considerable interest in recent years. Most works encode click points as Gaussian maps and concatenate them with the image as the model's input. However, the spatial and semantic information in the Gaussian maps becomes noisy as it passes through multiple convolution layers and is not fully exploited by the top layers for mask prediction. To pass click information to the top layers exactly and efficiently, we propose a coarse mask guided model (CMG), which predicts coarse masks with a coarse module to guide the object mask prediction. Specifically, the coarse module encodes user clicks as query features and enriches their semantic information with backbone features through transformer layers; coarse masks are then generated from the enriched query features and fed into CMG's decoder. Benefiting from the efficiency of the transformer, CMG's coarse module and decoder module are lightweight and computationally efficient, which makes the interaction process smoother. Experiments on several segmentation benchmarks demonstrate the effectiveness of our method, which achieves new state-of-the-art results compared with previous works.
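The conventional click encoding that the abstract contrasts CMG with (Gaussian maps concatenated with the image) can be illustrated with a minimal sketch. This is not the authors' code: the function names `gaussian_click_map` and `build_model_input`, the `sigma` value, the use of a per-pixel maximum over overlapping clicks, and the separate positive and negative click channels are all assumptions made for illustration.

```python
import math


def gaussian_click_map(height, width, clicks, sigma=10.0):
    """Return a height x width map with a Gaussian bump at each (row, col) click.

    Hypothetical helper: sigma and the max-over-clicks rule are assumptions.
    """
    heat = [[0.0] * width for _ in range(height)]
    for cy, cx in clicks:
        for y in range(height):
            for x in range(width):
                d2 = (y - cy) ** 2 + (x - cx) ** 2
                value = math.exp(-d2 / (2.0 * sigma ** 2))
                # Keep the strongest response where several clicks overlap.
                if value > heat[y][x]:
                    heat[y][x] = value
    return heat


def build_model_input(image, positive_clicks, negative_clicks, sigma=10.0):
    """Concatenate image channels with a positive and a negative click map.

    `image` is a list of channels, each a height x width list of floats.
    Using two extra channels is an assumption, not a detail from the abstract.
    """
    height, width = len(image[0]), len(image[0][0])
    pos = gaussian_click_map(height, width, positive_clicks, sigma)
    neg = gaussian_click_map(height, width, negative_clicks, sigma)
    return image + [pos, neg]
```

For a 3-channel image, `build_model_input` returns 5 channels. According to the abstract, CMG's coarse module instead encodes clicks as query features so that click information reaches the top layers more directly; that module is not reproduced here.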
Pages: 5808 - 5822
Number of pages: 15
Related papers
50 in total
  • [1] Fast Video Object Segmentation by Reference-Guided Mask Propagation
    Oh, Seoung Wug
    Lee, Joon-Young
    Sunkavalli, Kalyan
    Kim, Seon Joo
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 7376 - 7385
  • [2] Distance-Guided Mask Propagation Model for Efficient Video Object Segmentation
    Liu, Jiajia
    Dai, Hongning
    Li, Bo
    Tang, Gaozhong
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [3] Interactive Video Object Mask Annotation
    Trung-Nghia Le
    Nguyen, Tam V.
    Quoc-Cuong Tran
    Lam Nguyen
    Trung-Hieu Hoang
    Minh-Quan Le
    Minh-Triet Tran
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 16067 - 16070
  • [4] Guided Interactive Video Object Segmentation Using Reliability-Based Attention Maps
    Heo, Yuk
    Koh, Yeong Jun
    Kim, Chang-Su
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 7318 - 7326
  • [5] Mask encoding: A general instance mask representation for object segmentation
    Zhang, Rufeng
    Kong, Tao
    Wang, Xinlong
    You, Mingyu
    PATTERN RECOGNITION, 2022, 124
  • [6] Mask encoding: A general instance mask representation for object segmentation
    Zhang, Rufeng
    Kong, Tao
    Wang, Xinlong
    You, Mingyu
    Pattern Recognition, 2022, 124
  • [7] Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion
    Cheng, Ho Kei
    Tai, Yu-Wing
    Tang, Chi-Keung
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 5555 - 5564
  • [8] Interactive object segmentation in two phases
    Shi, Ran
    Ngan, King Ngi
    Li, Songnan
    Li, Hongliang
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2018, 65 : 107 - 114
  • [9] Fast interactive system for object segmentation
    Yang, Yijun
    Zhao, Rongchun
    Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University, 2000, 18 (02): : 289 - 292
  • [10] VIDEO OBJECT SEGMENTATION WITH ONLINE MASK REFINEMENT
    Sawada, Tomoya
    Lee, Teng-Yok
    Mizuno, Masahiro
    2022 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO WORKSHOPS (IEEE ICMEW 2022), 2022,