Informative Scene Graph Generation via Debiasing

Cited by: 0
Authors
Gao, Lianli [1]
Lyu, Xinyu [2]
Guo, Yuyu [1]
Hu, Yuxuan [3]
Li, Yuan-Fang [4]
Xu, Lu [5]
Shen, Heng Tao [6]
Song, Jingkuan [6]
Affiliations
[1] Univ Elect Sci & Technol China, Shenzhen Inst Adv Study, Shenzhen, Peoples R China
[2] Southwestern Univ Finance & Econ, Chengdu, Peoples R China
[3] Southwest Univ, Chongqing, Peoples R China
[4] Monash Univ, Melbourne, Vic, Australia
[5] Kuaishou, Beijing, Peoples R China
[6] Tongji Univ, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Scene graph generation; Visual relationship; Debiasing; Information content; SEMANTIC SIMILARITY; ATTENTION;
DOI
10.1007/s11263-025-02365-y
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Scene graph generation aims to detect visual relationship triplets, (subject, predicate, object). Due to biases in data, current models tend to predict common predicates, e.g., "on" and "at", instead of informative ones, e.g., "standing on" and "looking at". This tendency results in the loss of precise information and overall performance. If a model only uses "stone on road" rather than "stone blocking road" to describe an image, it may be a grave misunderstanding. We argue that this phenomenon is caused by two imbalances: semantic space level imbalance and training sample level imbalance. For this problem, we propose DB-SGG, an effective framework based on debiasing but not the conventional distribution fitting. It integrates two components: Semantic Debiasing (SD) and Balanced Predicate Learning (BPL), for these imbalances. SD utilizes a confusion matrix and a bipartite graph to construct predicate relationships. BPL adopts a random undersampling strategy and an ambiguity removing strategy to focus on informative predicates. 
Benefiting from the model-agnostic process, our method can be easily applied to SGG models and outperforms Transformer by 136.3%, 119.5%, and 122.6% on mR@20 at three SGG sub-tasks on the SGG-VG dataset. Our method is further verified on another complex SGG dataset (SGG-GQA) and two downstream tasks (sentence-to-graph retrieval and image captioning).
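
The abstract says BPL adopts a random undersampling strategy so that training focuses on informative predicates. As a rough illustration of random undersampling in general, and not the authors' implementation, the sketch below caps the number of training triplets per predicate label. The function name `undersample`, the `cap` parameter, and the toy triplets are all assumptions introduced here for illustration.

```python
import random
from collections import Counter, defaultdict


def undersample(samples, cap, seed=0):
    """Randomly keep at most `cap` triplets per predicate label.

    `samples` is a list of (subject, predicate, object) tuples.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Group triplets by their predicate (the middle element).
    by_label = defaultdict(list)
    for triplet in samples:
        by_label[triplet[1]].append(triplet)

    kept = []
    for label, group in by_label.items():
        # Frequent predicates are cut down to `cap`; rare ones are kept whole.
        if len(group) > cap:
            group = rng.sample(group, cap)
        kept.extend(group)
    return kept


# Toy data: the common predicate "on" dominates the informative "standing on".
toy = [("man", "on", "street")] * 8 + [("man", "standing on", "street")] * 2
balanced = undersample(toy, cap=3)
counts = Counter(predicate for _, predicate, _ in balanced)
print(counts)
```

With the cap applied, the common predicate no longer outnumbers the informative one by a wide margin, which is the general effect undersampling aims for; how DB-SGG actually chooses which samples to drop is described in the paper itself, not here.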
Pages: 24