Neural Motifs: Scene Graph Parsing with Global Context

Cited by: 642
Authors
Zellers, Rowan [1]
Yatskar, Mark [1, 2]
Thomson, Sam [3]
Choi, Yejin [1, 2]
Affiliations
[1] Univ Washington, Paul G Allen Sch Comp Sci & Engn, Seattle, WA 98195 USA
[2] Allen Inst Artificial Intelligence, Seattle, WA USA
[3] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
Funding
National Science Foundation (USA)
DOI
10.1109/CVPR.2018.00611
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We investigate the problem of producing structured graph representations of visual scenes. Our work analyzes the role of motifs: regularly appearing substructures in scene graphs. We present new quantitative insights on such repeated structures in the Visual Genome dataset. Our analysis shows that object labels are highly predictive of relation labels but not vice-versa. We also find that there are recurring patterns even in larger subgraphs: more than 50% of graphs contain motifs involving at least two relations. Our analysis motivates a new baseline: given object detections, predict the most frequent relation between object pairs with the given labels, as seen in the training set. This baseline improves on the previous state-of-the-art by an average of 3.6% relative improvement across evaluation settings. We then introduce Stacked Motif Networks, a new architecture designed to capture higher order motifs in scene graphs that further improves over our strong baseline by an average 7.1% relative gain. Our code is available at github.com/rowanz/neural-motifs.
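The frequency baseline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); it assumes that training annotations are available as plain (subject label, relation, object label) triples, and the function names and toy data below are hypothetical.

```python
from collections import Counter, defaultdict


def fit_frequency_baseline(training_triples):
    """Count how often each relation occurs for every (subject, object) label pair."""
    counts = defaultdict(Counter)
    for subj, rel, obj in training_triples:
        counts[(subj, obj)][rel] += 1
    return counts


def predict_relation(counts, subj, obj, default=None):
    """Return the most frequent training relation for the label pair, or `default` if unseen."""
    rels = counts.get((subj, obj))
    if not rels:
        return default
    return rels.most_common(1)[0][0]


# Toy training triples (hypothetical, not taken from Visual Genome).
train = [
    ("man", "wearing", "shirt"),
    ("man", "wearing", "shirt"),
    ("man", "holding", "shirt"),
    ("dog", "on", "grass"),
]
model = fit_frequency_baseline(train)
```

With these toy counts, `predict_relation(model, "man", "shirt")` picks "wearing" because it is the majority relation for that label pair. The prediction depends only on the two object labels, which is what makes this a label-frequency baseline.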
Pages: 5831-5840
Page count: 10