Improving Weakly Supervised Scene Graph Parsing through Object Grounding

Authors
Zhang, Yizhou [1]
Zheng, Zhaoheng [1]
Nevatia, Ram [1]
Liu, Yan [1]
Affiliations
[1] University of Southern California, Department of Computer Science, Los Angeles, CA 90089, USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Weakly supervised scene graph parsing, which learns structured image representations without annotated correspondences between graph nodes and visual objects, has been prevalent in recent computer vision research. Existing methods mainly focus on designing task-specific loss functions, model architectures, or optimization algorithms. We argue that correspondences between objects and graph nodes are crucial for the weakly supervised scene graph parsing task and are worth learning explicitly. We therefore propose GroParser, a framework that improves weakly supervised scene graph parsing models by grounding visual objects. The proposed weakly supervised grounding method learns a metric between visual objects and scene graph nodes by incorporating information from both object features and relational features. Specifically, we apply multi-instance learning to learn the object category information and exploit a two-stream graph neural network to model the relational similarity metric. Extensive experiments on the scene graph parsing task verify that the grounding found by our model can reinforce the performance of existing weakly supervised scene graph parsing methods, including the current state of the art. Further experiments on the Visual Genome (VG) and Visual Relation Detection (VRD) datasets verify that our model improves on existing approaches for the scene graph grounding task.
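As a rough illustration of what grounding scene graph nodes to visual objects means, the sketch below scores every node against every object proposal and assigns each node to its best-scoring proposal. It is not GroParser: the abstract describes a learned metric built from multi-instance learning and a two-stream graph neural network, whereas this sketch substitutes plain cosine similarity over made-up embeddings. All function names and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(node_embs, object_embs):
    """Rows are scene graph nodes, columns are object proposals."""
    return [[cosine(n, o) for o in object_embs] for n in node_embs]


def ground_nodes(sim):
    """Assign each node to the index of its most similar object proposal."""
    return [max(range(len(row)), key=row.__getitem__) for row in sim]


def mil_score(sim_row):
    """Multi-instance-style bag score: the best instance (proposal) stands in for the image."""
    return max(sim_row)


# Toy data (assumed for illustration): two graph nodes, three object proposals.
node_embs = [[1.0, 0.0], [0.0, 1.0]]
object_embs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]

sim = similarity_matrix(node_embs, object_embs)
grounding = ground_nodes(sim)
```

In a weakly supervised setting only the image-level graph is known, so a max-over-proposals score such as `mil_score` is one common way to train without box-level labels; the independent per-node argmax in `ground_nodes` is the simplest possible assignment rule and ignores the relational features that the abstract says the method also uses.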
Pages: 4058-4064
Page count: 7