Structured Neural Motifs: Scene Graph Parsing via Enhanced Context

Cited by: 3
Authors
Li, Yiming [1, 4]
Yang, Xiaoshan [2, 3, 4]
Xu, Changsheng [1, 2, 3, 4]
Affiliations
[1] HeFei Univ Technol, Hefei, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing, Peoples R China
[3] Univ Chinese Acad Sci, Beijing, Peoples R China
[4] Peng Cheng Lab, Shenzhen, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
Scene graph; Deep learning; LSTMs;
DOI
10.1007/978-3-030-37734-2_15
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A scene graph is a structured representation of the visual content of an image. It is helpful for complex visual understanding tasks such as image captioning, visual question answering, and semantic image retrieval. Because real-world images typically contain multiple object instances and complex relationships, context information is extremely important for scene graph generation. It has been noted that the context dependencies among different nodes in the scene graph are asymmetric, which means that relationship labels can often be predicted directly from object labels, but not vice versa. Building on this finding, the existing motifs network has successfully exploited the context patterns among object nodes and the dependencies between object nodes and relation nodes. However, it neglects spatial information and the context dependencies among relation nodes. In this work, we propose the Structured Motif Network (StrcMN), which predicts object labels and pairwise relationships by mining more complete global context features. Experiments show that our model significantly outperforms previous methods on the VRD and Visual Genome datasets.
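The asymmetry described in the abstract (relationship labels are often predictable from object labels) can be illustrated with a simple frequency lookup. The sketch below is not the StrcMN model or any part of the authors' method; it is a minimal illustration under stated assumptions. The toy triples and the function names `build_predicate_prior` and `predict_predicate` are hypothetical and invented here for the example.

```python
from collections import Counter, defaultdict

# Hypothetical toy annotations: (subject label, predicate, object label).
# These triples are invented for illustration and come from no dataset.
triples = [
    ("man", "riding", "horse"),
    ("man", "riding", "horse"),
    ("man", "near", "horse"),
    ("man", "wearing", "hat"),
    ("woman", "wearing", "hat"),
    ("dog", "near", "horse"),
]


def build_predicate_prior(triples):
    """Count how often each predicate occurs for each (subject, object) label pair."""
    counts = defaultdict(Counter)
    for subj, pred, obj in triples:
        counts[(subj, obj)][pred] += 1
    return counts


def predict_predicate(counts, subj, obj):
    """Return the most frequent predicate for a label pair, or None if the pair is unseen."""
    preds = counts.get((subj, obj))
    if not preds:
        return None
    return preds.most_common(1)[0][0]


prior = build_predicate_prior(triples)
print(predict_predicate(prior, "man", "horse"))  # riding
```

In this toy data the reverse direction is less informative: the predicate "near" appears with both ("man", "horse") and ("dog", "horse"), so knowing the predicate alone does not determine the object labels.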
Pages: 175-188
Page count: 14