Neural Motifs: Scene Graph Parsing with Global Context

Cited by: 642

Authors:

Zellers, Rowan [1]

Yatskar, Mark [1,2]

Thomson, Sam [3]

Choi, Yejin [1,2]

Affiliations:

[1] Univ Washington, Paul G Allen Sch Comp Sci & Engn, Seattle, WA 98195 USA

[2] Allen Inst Artificial Intelligence, Seattle, WA USA

[3] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA

Source:

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2018

Funding:

U.S. National Science Foundation;

DOI:

10.1109/CVPR.2018.00611

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We investigate the problem of producing structured graph representations of visual scenes. Our work analyzes the role of motifs: regularly appearing substructures in scene graphs. We present new quantitative insights on such repeated structures in the Visual Genome dataset. Our analysis shows that object labels are highly predictive of relation labels but not vice-versa. We also find that there are recurring patterns even in larger subgraphs: more than 50% of graphs contain motifs involving at least two relations. Our analysis motivates a new baseline: given object detections, predict the most frequent relation between object pairs with the given labels, as seen in the training set. This baseline improves on the previous state-of-the-art by an average of 3.6% relative improvement across evaluation settings. We then introduce Stacked Motif Networks, a new architecture designed to capture higher order motifs in scene graphs that further improves over our strong baseline by an average 7.1% relative gain. Our code is available at github.com/rowanz/neural-motifs.
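The frequency baseline described in the abstract (predict the most frequent training-set relation for a given pair of object labels) can be illustrated with a minimal sketch. This is not the authors' released code: the function names, the `(subject, relation, object)` triple format, the `no_relation` fallback, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def fit_frequency_baseline(training_triples):
    """Count how often each relation label occurs for every (subject, object) label pair."""
    counts = defaultdict(Counter)
    for subj, rel, obj in training_triples:
        counts[(subj, obj)][rel] += 1
    return counts


def predict_relation(counts, subj, obj, default="no_relation"):
    """Return the most frequent training relation for the label pair.

    Falls back to `default` (a hypothetical placeholder label) when the
    pair never appeared in the training triples.
    """
    pair_counts = counts.get((subj, obj))
    if not pair_counts:
        return default
    return pair_counts.most_common(1)[0][0]


# Toy training triples (illustrative only, not taken from Visual Genome).
toy_triples = [
    ("man", "wearing", "shirt"),
    ("man", "wearing", "shirt"),
    ("man", "holding", "shirt"),
    ("dog", "on", "grass"),
]

counts = fit_frequency_baseline(toy_triples)
print(predict_relation(counts, "man", "shirt"))  # most frequent relation for this pair
print(predict_relation(counts, "cat", "sofa"))   # unseen pair, falls back to the default
```

In this sketch the prediction depends only on the two object labels, which mirrors the abstract's observation that object labels are highly predictive of relation labels.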

Pages: 5831 - 5840

Page count: 10

50 items in total

[1] Structured Neural Motifs: Scene Graph Parsing via Enhanced Context
Li, Yiming
Yang, Xiaoshan
Xu, Changsheng
MULTIMEDIA MODELING (MMM 2020), PT II, 2020, 11962 : 175 - 188
[2] Scene Parsing with Global Context Embedding
Hung, Wei-Chih
Tsai, Yi-Hsuan
Shen, Xiaohui
Lin, Zhe
Sunkavalli, Kalyan
Lu, Xin
Yang, Ming-Hsuan
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017 : 2650 - 2658
[3] Learning to transfer focus of graph neural network for scene graph parsing
Jiang, Junjie
He, Zaixing
Zhang, Shuyou
Zhao, Xinyue
Tan, Jianrong
PATTERN RECOGNITION, 2021, 112
[4] Re:PolyWorld - A Graph Neural Network for Polygonal Scene Parsing
Zorzi, Stefano
Fraundorfer, Friedrich
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023 : 16716 - 16725
[5] Global-Guided Selective Context Network for Scene Parsing
Jiang, Jie
Liu, Jing
Fu, Jun
Zhu, Xinxin
Li, Zechao
Lu, Hanqing
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (04) : 1752 - 1764
[7] Adaptive Context Network for Scene Parsing
Fu, Jun
Liu, Jing
Wang, Yuhang
Li, Yong
Bao, Yongjun
Tang, Jinhui
Lu, Hanqing
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019 : 6747 - 6756
[8] Hierarchical Parsing Net: Semantic Scene Parsing From Global Scene to Objects
Shi, Hengcan
Li, Hongliang
Meng, Fanman
Wu, Qingbo
Xu, Linfeng
Ngan, King Ngi
IEEE TRANSACTIONS ON MULTIMEDIA, 2018, 20 (10) : 2670 - 2682
[9] Graphical Contrastive Losses for Scene Graph Parsing
Zhang, Ji
Shih, Kevin J.
Elgammal, Ahmed
Tao, Andrew
Catanzaro, Bryan
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019 : 11527 - 11535
[10] Similarity Based Context for Nonparametric Scene Parsing
Alinia, Parvaneh
Razzaghi, Parvin
2017 25TH IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2017 : 1509 - 1514
