IS-GGT: Iterative Scene Graph Generation with Generative Transformers

Cited by: 6
Authors
Kundu, Sanjoy [1]
Aakur, Sathyanarayanan N. [1]
Affiliations
[1] Oklahoma State Univ, Dept Comp Sci, Stillwater, OK 74078 USA
Funding
U.S. National Science Foundation
DOI
10.1109/CVPR52729.2023.00609
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Scene graphs provide a rich, structured representation of a scene by encoding the entities (objects) and their spatial relationships in a graphical format. This representation has proven useful in several tasks, such as question answering, captioning, and even object detection, to name a few. Current approaches take a generation-by-classification approach where the scene graph is generated through labeling of all possible edges between objects in a scene, which adds computational overhead to the approach. This work introduces a generative transformer-based approach to generating scene graphs beyond link prediction. Using two transformer-based components, we first sample a possible scene graph structure from detected objects and their visual features. We then perform predicate classification on the sampled edges to generate the final scene graph. This approach allows us to efficiently generate scene graphs from images with minimal inference overhead. Extensive experiments on the Visual Genome dataset demonstrate the efficiency of the proposed approach. Without bells and whistles, we obtain, on average, 20.7% mean recall (mR@100) across different settings for scene graph generation (SGG), outperforming state-of-the-art SGG approaches while offering competitive performance to unbiased SGG approaches.
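The two-stage idea in the abstract (sample a graph structure first, then classify predicates only on the sampled edges) can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_sampler` and `toy_classify` are hypothetical stand-ins for the two transformer-based components, and the object names and predicates are invented for illustration. The sketch only shows why labeling sampled edges needs fewer classifier calls than labeling every ordered object pair.

```python
from itertools import permutations
from typing import Callable, List, Tuple

Edge = Tuple[int, int]            # (subject index, object index)
Triple = Tuple[int, str, int]     # (subject index, predicate, object index)


def all_pairs(num_objects: int) -> List[Edge]:
    """Every ordered pair of distinct objects, i.e. the edge set that a
    generation-by-classification pipeline has to label."""
    return list(permutations(range(num_objects), 2))


def dense_baseline(num_objects: int,
                   classify: Callable[[int, int], str]) -> List[Triple]:
    """Label all N * (N - 1) candidate edges."""
    return [(s, classify(s, o), o) for s, o in all_pairs(num_objects)]


def two_stage(num_objects: int,
              sampler: Callable[[int], List[Edge]],
              classify: Callable[[int, int], str]) -> List[Triple]:
    """Stage 1: sample a graph structure. Stage 2: label only those edges."""
    edges = sampler(num_objects)
    return [(s, classify(s, o), o) for s, o in edges]


# --- Toy stand-ins (assumptions, not taken from the paper) ---
objects = ["man", "horse", "hat", "field"]
toy_labels = {(0, 1): "riding", (0, 2): "wearing", (1, 3): "on"}
calls = {"count": 0}


def toy_sampler(num_objects: int) -> List[Edge]:
    # Placeholder for the learned structure sampler: returns a fixed edge list.
    return list(toy_labels)


def toy_classify(s: int, o: int) -> str:
    # Placeholder for the predicate classifier; counts how often it is called.
    calls["count"] += 1
    return toy_labels.get((s, o), "none")


dense = dense_baseline(len(objects), toy_classify)
dense_calls = calls["count"]

calls["count"] = 0
sparse = two_stage(len(objects), toy_sampler, toy_classify)
sparse_calls = calls["count"]

print(dense_calls, sparse_calls)
print([(objects[s], p, objects[o]) for s, p, o in sparse])
```

In this toy setting the dense baseline calls the classifier once per ordered pair, while the two-stage variant calls it once per sampled edge. How many edges a learned sampler actually keeps depends on the model and the image.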
Pages: 6292-6301
Page count: 10
Related papers
50 in total
  • [11] Attribute Prototype-Guided Iterative Scene Graph for Explainable Radiology Report Generation
    Zhang, Ke
    Yang, Yan
    Yu, Jun
    Fan, Jianping
    Jiang, Hanliang
    Huang, Qingming
    Han, Weidong
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2024, 43 (12) : 4470 - 4482
  • [12] Unconditional Scene Graph Generation
    Garg, Sarthak
    Dhamo, Helisa
    Farshad, Azade
    Musatian, Sabrina
    Navab, Nassir
    Tombari, Federico
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 16342 - 16351
  • [13] Panoptic Scene Graph Generation
    Yang, Jingkang
    Ang, Yi Zhe
    Guo, Zujin
    Zhou, Kaiyang
    Zhang, Wayne
    Liu, Ziwei
    COMPUTER VISION - ECCV 2022, PT XXVII, 2022, 13687 : 178 - 196
  • [14] Iterative Learning with Extra and Inner Knowledge for Long-tail Dynamic Scene Graph Generation
    Li, Yiming
    Yang, Xiaoshan
    Xu, Changsheng
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 4707 - 4715
  • [15] Beware of Overcorrection: Scene-induced Commonsense Graph for Scene Graph Generation
    Chen, Lianggangxu
    Lu, Jiale
    Song, Youqi
    Wang, Changbo
    He, Gaoqi
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 2888 - 2897
  • [16] Multimodal graph inference network for scene graph generation
    Jingwen Duan
    Weidong Min
    Deyu Lin
    Jianfeng Xu
    Xin Xiong
    Applied Intelligence, 2021, 51 : 8768 - 8783
  • [17] Multimodal graph inference network for scene graph generation
    Duan, Jingwen
    Min, Weidong
    Lin, Deyu
    Xu, Jianfeng
    Xiong, Xin
    APPLIED INTELLIGENCE, 2021, 51 (12) : 8768 - 8783
  • [18] Graph R-CNN for Scene Graph Generation
    Yang, Jianwei
    Lu, Jiasen
    Lee, Stefan
    Batra, Dhruv
    Parikh, Devi
    COMPUTER VISION - ECCV 2018, PT I, 2018, 11205 : 690 - 706
  • [19] Scene Graph Generation: A comprehensive survey
    Li, Hongsheng
    Zhu, Guangming
    Zhang, Liang
    Jiang, Youliang
    Dang, Yixuan
    Hou, Haoran
    Shen, Peiyi
    Zhao, Xia
    Shah, Syed Afaq Ali
    Bennamoun, Mohammed
    NEUROCOMPUTING, 2024, 566
  • [20] Relation Regularized Scene Graph Generation
    Guo, Yuyu
    Gao, Lianli
    Song, Jingkuan
    Wang, Peng
    Sebe, Nicu
    Shen, Heng Tao
    Li, Xuelong
    IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (07) : 5961 - 5972