Meta Spatio-Temporal Debiasing for Video Scene Graph Generation

Cited by: 15
Authors
Xu, Li [1 ]
Qu, Haoxuan [1 ]
Kuen, Jason [2 ]
Gu, Jiuxiang [2 ]
Liu, Jun [1 ]
Affiliations
[1] Singapore Univ Technol & Design, Singapore, Singapore
[2] Adobe Res, San Jose, CA USA
Funding
National Research Foundation, Singapore
Keywords
VidSGG; Long-tailed bias; Meta learning;
DOI
10.1007/978-3-031-19812-0_22
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Video scene graph generation (VidSGG) aims to parse video content into scene graphs, which involves modeling the spatio-temporal contextual information in the video. However, due to the long-tailed distribution of the training data, the generalization performance of existing VidSGG models can be harmed by the spatio-temporal conditional bias problem. In this work, from the perspective of meta-learning, we propose a novel Meta Video Scene Graph Generation (MVSGG) framework to address this bias problem. Specifically, to handle various types of spatio-temporal conditional biases, our framework first constructs a support set and a group of query sets from the training data, where the data distribution of each query set differs from that of the support set w.r.t. one type of conditional bias. Then, by performing a novel meta training and testing process that optimizes the model to achieve good testing performance on these query sets after training on the support set, our framework effectively guides the model to learn to generalize well against biases. Extensive experiments demonstrate the efficacy of our proposed framework.
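The meta training-and-testing scheme summarized in the abstract can be sketched as a first-order meta-optimization loop: adapt the model on a support set, evaluate it on query sets whose distributions differ from the support set, and update the initial parameters using the query losses so that post-adaptation performance generalizes across the shifts. The toy 1-D linear model, the synthetic data, and the first-order (FOMAML-style) update below are hypothetical stand-ins chosen for illustration, not the paper's actual VidSGG model or losses.

```python
# Hypothetical first-order sketch of meta training and testing:
# inner loop adapts on a support set; outer loop updates the initial
# parameters from losses on query sets with shifted distributions.

def mse(w, data):
    # mean squared error of the 1-D linear model y = w * x
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def grad_mse(w, data):
    # gradient of the MSE above with respect to w
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)

def meta_step(w, support, query_sets, inner_lr=0.05, outer_lr=0.05):
    # inner loop: one adaptation step on the support set
    w_adapted = w - inner_lr * grad_mse(w, support)
    # outer loop: first-order update using query losses at the adapted point,
    # averaged over all query sets (one per type of distribution shift)
    outer_grad = sum(grad_mse(w_adapted, q) for q in query_sets) / len(query_sets)
    return w - outer_lr * outer_grad

# ground-truth relation y = 2x; the query sets sample x-ranges
# the support set does not cover, mimicking a conditional shift
support = [(x, 2 * x) for x in (1.0, 2.0, 3.0)]
query_sets = [
    [(x, 2 * x) for x in (0.5, 4.0)],
    [(x, 2 * x) for x in (5.0, 6.0)],
]

w = 0.0
for _ in range(200):
    w = meta_step(w, support, query_sets)
```

After the loop, `w` approaches the true slope 2.0, i.e. the meta-learned initialization performs well on the shifted query sets after adapting on the support set.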
Pages: 374-390
Page count: 17
Related Papers
50 items in total
  • [41] Spatio-temporal querying in video databases
    Koprulu, M
    Cicekli, NK
    Yazici, A
    INFORMATION SCIENCES, 2004, 160 (1-4) : 131 - 152
  • [42] Panoptic Video Scene Graph Generation
    Yang, Jingkang
    Peng, Wenxuan
    Li, Xiangtai
    Guo, Zujin
    Chen, Liangyu
    Li, Bo
    Ma, Zheng
    Zhou, Kaiyang
    Zhang, Wayne
    Loy, Chen Change
    Liu, Ziwei
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 18675 - 18685
  • [43] SPATIO-TEMPORAL VIDEO FILTERING FOR VIDEO SURVEILLANCE APPLICATIONS
    Ben Hamida, Amal
    Koubaa, Mohamed
    Nicolas, Henri
    Ben Amar, Chokri
    ELECTRONIC PROCEEDINGS OF THE 2013 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO WORKSHOPS (ICMEW), 2013,
  • [44] VirtualHome Action Genome: A Simulated Spatio-Temporal Scene Graph Dataset with Consistent Relationship Labels
    Qiu, Yue
    Nagasaki, Yoshiki
    Hara, Kensho
    Kataoka, Hirokatsu
    Suzuki, Ryota
    Iwata, Kenji
    Satoh, Yutaka
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 3340 - 3349
  • [45] End-to-End Video Scene Graph Generation With Temporal Propagation Transformer
    Zhang, Yong
    Pan, Yingwei
    Yao, Ting
    Huang, Rui
    Mei, Tao
    Chen, Chang-Wen
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 1613 - 1625
  • [46] Spatial–Temporal Knowledge-Embedded Transformer for Video Scene Graph Generation
    Pu, Tao
    Chen, Tianshui
    Wu, Hefeng
    Lu, Yongyi
    Lin, Liang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 556 - 568
  • [47] STRG-QL: Spatio-temporal region graph query language for video databases
    Lee, Jeongkyu
    Celebi, M. Emre
    MULTIMEDIA CONTENT ACCESS: ALGORITHMS AND SYSTEMS II, 2008, 6820
  • [48] Keyword-Aware Relative Spatio-Temporal Graph Networks for Video Question Answering
    Cheng, Yi
    Fan, Hehe
    Lin, Dongyun
    Sun, Ying
    Kankanhalli, Mohan
    Lim, Joo-Hwee
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 6131 - 6141
  • [49] Video Saliency Detection Based On Robust Seeds Generation And Spatio-Temporal Propagation
    Tian, Kai
    Lu, Zongqing
    Liao, Qingmin
    Wang, Na
    2017 10TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, BIOMEDICAL ENGINEERING AND INFORMATICS (CISP-BMEI), 2017,
  • [50] Cross-Attentional Spatio-Temporal Semantic Graph Networks for Video Question Answering
    Liu, Yun
    Zhang, Xiaoming
    Huang, Feiran
    Zhang, Bo
    Li, Zhoujun
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 1684 - 1696