IS-GGT: Iterative Scene Graph Generation with Generative Transformers

Cited by: 6
Authors
Kundu, Sanjoy [1]
Aakur, Sathyanarayanan N. [1]
Affiliations
[1] Oklahoma State Univ, Dept Comp Sci, Stillwater, OK 74078 USA
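Funding
U.S. National Science Foundation
Keywords
DOI
10.1109/CVPR52729.2023.00609
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract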
Scene graphs provide a rich, structured representation of a scene by encoding the entities (objects) and their spatial relationships in a graphical format. This representation has proven useful in several tasks, such as question answering, captioning, and even object detection, to name a few. Current approaches take a generation-by-classification approach where the scene graph is generated through labeling of all possible edges between objects in a scene, which adds computational overhead to the approach. This work introduces a generative transformer-based approach to generating scene graphs beyond link prediction. Using two transformer-based components, we first sample a possible scene graph structure from detected objects and their visual features. We then perform predicate classification on the sampled edges to generate the final scene graph. This approach allows us to efficiently generate scene graphs from images with minimal inference overhead. Extensive experiments on the Visual Genome dataset demonstrate the efficiency of the proposed approach. Without bells and whistles, we obtain, on average, 20.7% mean recall (mR@100) across different settings for scene graph generation (SGG), outperforming state-of-the-art SGG approaches while offering competitive performance to unbiased SGG approaches.
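The two-stage idea in the abstract (first pick a sparse set of edges, then classify predicates only on those edges) can be illustrated with a toy sketch. This is not the authors' implementation: the learned generative transformer is replaced here by a simple score threshold, and every name, score, and predicate below (`sample_edges`, `edge_scores`, `toy_predicates`, and so on) is a hypothetical stand-in chosen for illustration.

```python
from itertools import permutations

# Hypothetical detected objects; in practice these would come from an object detector.
objects = ["man", "horse", "hat", "field"]


def all_directed_pairs(num_objects):
    """Every ordered (subject, object) index pair, i.e. what exhaustive edge labeling scores."""
    return list(permutations(range(num_objects), 2))


def sample_edges(edge_scores, threshold=0.5):
    """Stage 1 stand-in: keep only pairs whose score clears the threshold.

    Assumption: a plain threshold replaces the learned structure model.
    """
    return sorted(pair for pair, score in edge_scores.items() if score >= threshold)


def classify_predicates(edges, predicate_fn):
    """Stage 2 stand-in: label predicates on the sampled edges only."""
    return [(s, predicate_fn(s, o), o) for s, o in edges]


# Hand-written toy scores standing in for a learned structure model.
edge_scores = {pair: 0.1 for pair in all_directed_pairs(len(objects))}
edge_scores[(0, 1)] = 0.9  # man -> horse
edge_scores[(0, 2)] = 0.8  # man -> hat
edge_scores[(1, 3)] = 0.7  # horse -> field

# Hand-written lookup standing in for a learned predicate classifier.
toy_predicates = {(0, 1): "riding", (0, 2): "wearing", (1, 3): "on"}


def toy_predicate_fn(s, o):
    return toy_predicates.get((s, o), "unknown")


sampled = sample_edges(edge_scores)
triples = [
    (objects[s], p, objects[o])
    for s, p, o in classify_predicates(sampled, toy_predicate_fn)
]

print(len(all_directed_pairs(len(objects))), "candidate pairs;", len(sampled), "classified")
for triple in triples:
    print(triple)
```

With four objects, exhaustive labeling would score all 12 ordered pairs, while this sketch runs the predicate step on only the 3 sampled edges. That gap is the kind of inference saving the abstract describes, although the actual numbers depend on the real models.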
Pages: 6292 - 6301
Page count: 10
Related papers
50 in total
  • [41] Informative Scene Graph Generation via Debiasing
    Gao, Lianli
    Lyu, Xinyu
    Guo, Yuyu
    Hu, Yuxuan
    Li, Yuan-Fang
    Xu, Lu
    Shen, Heng Tao
    Song, Jingkuan
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2025,
  • [42] Kumaraswamy Wavelet for Heterophilic Scene Graph Generation
    Chen, Lianggangxu
    Song, Youqi
    Lin, Shaohui
    Wang, Changbo
    He, Gaoqi
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 2, 2024, : 1138 - 1146
  • [43] Neural Belief Propagation for Scene Graph Generation
    Liu, Daqi
    Bober, Miroslaw
    Kittler, Josef
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (08) : 10161 - 10172
  • [44] Multimodal Context Embedding for Scene Graph Generation
    Jung, Gayoung
    Kim, Incheol
    JOURNAL OF INFORMATION PROCESSING SYSTEMS, 2020, 16 (06) : 1250 - 1260
  • [45] Quaternion Relation Embedding for Scene Graph Generation
    Wang, Zheng
    Xu, Xing
    Wang, Guoqing
    Yang, Yang
    Shen, Heng Tao
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 8646 - 8656
  • [46] Generation of Shadows in Scene Graph based VR
    Kuehl, Bjoern
    Blom, Kristopher J.
    Beckhaus, Steffi
    WSCG 2007, FULL PAPERS PROCEEDINGS I AND II, 2007, : 295 - 302
  • [47] Constrained Structure Learning for Scene Graph Generation
    Liu, Daqi
    Bober, Miroslaw
    Kittler, Josef
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (10) : 11588 - 11599
  • [48] Consistent Scene Graph Generation by Constraint Optimization
    Chen, Boqi
    Marussy, Kristof
    Pilarski, Sebastian
    Semerath, Oszkar
    Varro, Daniel
    PROCEEDINGS OF THE 37TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING, ASE 2022, 2022,
  • [49] Complex Relation Embedding for Scene Graph Generation
    Wang, Zheng
    Xu, Xing
    Zhang, Yin
    Yang, Yang
    Shen, Heng Tao
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (06) : 8321 - 8335
  • [50] Text Generation from Knowledge Graphs with Graph Transformers
    Koncel-Kedziorski, Rik
    Bekal, Dhanush
    Luan, Yi
    Lapata, Mirella
    Hajishirzi, Hannaneh
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 2284 - 2293