Improving Weakly Supervised Scene Graph Parsing through Object Grounding

Authors
Zhang, Yizhou [1 ]
Zheng, Zhaoheng [1 ]
Nevatia, Ram [1 ]
Liu, Yan [1 ]
Affiliation
[1] Univ Southern Calif, Dept Comp Sci, Los Angeles, CA 90089 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Weakly supervised scene graph parsing, which learns structured image representations without annotated correspondences between graph nodes and visual objects, has been prevalent in recent computer vision research. Existing methods mainly focus on designing task-specific loss functions, model architectures, or optimization algorithms. We argue that correspondences between objects and graph nodes are crucial for the weakly supervised scene graph parsing task and are worth learning explicitly. We therefore propose GroParser, a framework that improves weakly supervised scene graph parsing models by grounding visual objects. The proposed weakly supervised grounding method learns a metric between visual objects and scene graph nodes by incorporating information from both object features and relational features. Specifically, we apply multi-instance learning to learn object category information and use a two-stream graph neural network to model the relational similarity metric. Extensive experiments on the scene graph parsing task verify that the grounding found by our model can reinforce the performance of existing weakly supervised scene graph parsing methods, including the current state of the art. Further experiments on the Visual Genome (VG) and Visual Relation Detection (VRD) datasets verify that our model improves on existing approaches in the scene graph grounding task.
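The multi-instance learning step mentioned in the abstract can be illustrated with a generic sketch. This is not GroParser's implementation: it assumes a common MIL formulation in which per-region class logits are max-pooled into image-level logits and trained with binary cross-entropy against the categories that appear in the scene graph. The function names, the toy logits, and the argmax grounding rule are all hypothetical, and the relational stream (the two-stream graph neural network) is not modeled here.

```python
import math

def mil_image_scores(region_logits):
    """Max-pool per-region class logits (R x C) into one image-level logit per class."""
    num_classes = len(region_logits[0])
    return [max(row[c] for row in region_logits) for c in range(num_classes)]

def mil_loss(region_logits, present_classes):
    """Mean binary cross-entropy between max-pooled logits and image-level labels."""
    image_logits = mil_image_scores(region_logits)
    loss = 0.0
    for c, z in enumerate(image_logits):
        p = 1.0 / (1.0 + math.exp(-z))            # sigmoid of the pooled logit
        y = 1.0 if c in present_classes else 0.0  # label: class appears in the scene graph
        loss -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return loss / len(image_logits)

def ground_nodes(node_classes, region_logits):
    """Assumed grounding rule: map each graph node to the region scoring highest for its class."""
    regions = range(len(region_logits))
    return [max(regions, key=lambda r: region_logits[r][c]) for c in node_classes]

# Toy example: 3 candidate regions, 3 object classes (hypothetical values).
region_logits = [
    [4.0, -3.0, -2.0],   # region 0 looks like class 0
    [-2.0, 3.5, -1.0],   # region 1 looks like class 1
    [-3.0, -2.5, -4.0],  # region 2 is background-like
]
node_classes = [0, 1]    # classes of the scene graph nodes; no region-level labels are used

grounding = ground_nodes(node_classes, region_logits)
loss = mil_loss(region_logits, set(node_classes))
print("node -> region:", grounding)
print("MIL loss:", loss)
```

Only image-level category labels enter the loss, which is what makes the supervision weak; the node-to-region correspondence is read off afterwards from the learned scores.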
Pages: 4058-4064
Page count: 7