Amodal instance segmentation with dual guidance from contextual and shape priors

Cited by: 0
Authors
Zhan, Jiao [1 ]
Luo, Yarong [1 ]
Guo, Chi [1 ,2 ]
Wu, Yejun [3 ]
Yang, Bohan [1 ]
Wang, Jingrong [1 ]
Liu, Jingnan [1 ]
Affiliations
[1] Wuhan Univ, GNSS Res Ctr, Wuhan 430072, Hubei, Peoples R China
[2] Hubei Luojia Lab, Wuhan 430079, Hubei, Peoples R China
[3] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Hubei, Peoples R China
Funding
China Postdoctoral Science Foundation
Keywords
Instance segmentation; Amodal instance segmentation; Pixel affinity; Contextual dependency
DOI
10.1016/j.asoc.2024.112602
CLC classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Human perception possesses the remarkable ability to mentally reconstruct the complete structure of occluded objects, which has inspired researchers to pursue amodal instance segmentation for a more comprehensive understanding of the scene. Previous works have shown promising results, but they often capture contextual dependencies in an unsupervised way, which can lead to undesirable dependencies and unreasonable feature representations. To tackle this problem, we propose a Pixel Affinity-Parsing (PAP) module trained with the Pixel Affinity Loss (PAL). Embedded into a CNN, the PAP module leverages learned contextual priors to guide the network to explicitly distinguish different relationships between pixels, thus capturing intra-class and inter-class contextual dependencies in a non-local and supervised way. This process helps yield robust feature representations and prevents the network from making misjudgments. To demonstrate the effectiveness of the PAP module, we design an effective Pixel Affinity-Parsing Network (PAPNet). Notably, PAPNet also introduces shape priors to guide the amodal mask refinement process, thus preventing implausible shapes in the predicted masks. Consequently, with the dual guidance of contextual and shape priors, PAPNet can reconstruct the full shape of occluded objects accurately and reasonably. Experimental results demonstrate that the proposed PAPNet outperforms existing state-of-the-art methods on multiple amodal datasets. Specifically, on the KINS dataset, PAPNet achieves 37.1% AP, 60.6% AP50 and 39.8% AP75, surpassing C2F-Seg by 0.6%, 2.4% and 2.8%. On the D2SA dataset, PAPNet achieves 71.70% AP, 85.98% AP50 and 77.10% AP75, surpassing PGExp by 0.75% and 0.33% in AP50 and AP75 while being comparable in AP. On the COCOA-cls dataset, PAPNet achieves 41.29% AP, 60.95% AP50 and 46.17% AP75, surpassing PGExp by 3.74%, 3.21% and 4.76%. On the CWALT dataset, PAPNet achieves 72.51% AP, 85.02% AP50 and 80.47% AP75, surpassing VRSPNet by 5.38%, 0.07% and 5.35%. The code is available at https://github.com/jiaoZ7688/PAP-Net.
Pages: 18
Related papers
50 records in total
  • [1] Amodal Instance Segmentation
    Li, Ke
    Malik, Jitendra
    COMPUTER VISION - ECCV 2016, PT II, 2016, 9906 : 677 - 693
  • [2] GIN: Generative INvariant Shape Prior for Amodal Instance Segmentation
    Li, Zhixuan
    Ye, Weining
    Jiang, Tingting
    Huang, Tiejun
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 3924 - 3936
  • [3] ShapeFormer: Shape Prior Visible-to-Amodal Transformer-based Amodal Instance Segmentation
    Tran, Minh
    Institute of Electrical and Electronics Engineers Inc., 2024
  • [4] LEARNING VECTOR QUANTIZED SHAPE CODE FOR AMODAL BLASTOMERE INSTANCE SEGMENTATION
    Jang, Won-Dong
    Wei, Donglai
    Zhang, Xingxuan
    Leahy, Brian
    Yang, Helen
    Tompkin, James
    Ben-Yosef, Dalit
    Needleman, Daniel
    Pfister, Hanspeter
    2023 IEEE 20TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING, ISBI, 2023
  • [5] Amodal Instance Segmentation with KINS Dataset
    Qi, Lu
    Jiang, Li
    Liu, Shu
    Shen, Xiaoyong
    Jia, Jiaya
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 3009 - 3018
  • [6] Layered Embeddings for Amodal Instance Segmentation
    Liu, Yanfeng
    Psota, Eric T.
    Perez, Lance C.
    IMAGE ANALYSIS AND RECOGNITION, ICIAR 2019, PT I, 2019, 11662 : 102 - 111
  • [7] One-Shot Shape-Based Amodal-to-Modal Instance Segmentation
    Li, Andrew
    Danielczuk, Michael
    Goldberg, Ken
    2020 IEEE 16TH INTERNATIONAL CONFERENCE ON AUTOMATION SCIENCE AND ENGINEERING (CASE), 2020, : 1375 - 1382
  • [8] 2D Amodal Instance Segmentation Guided by 3D Shape Prior
    Li, Zhixuan
    Ye, Weining
    Jiang, Tingting
    Huang, Tiejun
    COMPUTER VISION, ECCV 2022, PT XXIX, 2022, 13689 : 165 - 181
  • [9] Amodal Instance Segmentation via Prior-Guided Expansion
    Chen, Junjie
    Niu, Li
    Zhang, Jianfu
    Si, Jianlou
    Qian, Chen
    Zhang, Liqing
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1, 2023: 313 - 321
  • [10] Amodal Segmentation Based on Visible Region Segmentation and Shape Prior
    Xiao, Yuting
    Xu, Yanyu
    Zhong, Ziming
    Luo, Weixin
    Li, Jiawei
    Gao, Shenghua
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 2995 - 3003