Video-based spatio-temporal scene graph generation with efficient self-supervision tasks

Cited by: 0
Authors
Lianggangxu Chen
Yiqing Cai
Changhong Lu
Changbo Wang
Gaoqi He
Affiliations
[1] Chongqing Institute of East China Normal University, Chongqing Key Laboratory of Precision Optics
[2] East China Normal University, School of Computer Science and Technology
[3] East China Normal University, School of Mathematical Sciences
Keywords
Spatio-temporal scene graphs generation; Self-supervision; Local relation-aware attention
DOI
Not available
Abstract
Spatio-temporal Scene Graphs Generation (STSGG) aims to extract a sequence of graph-based semantic representations for high-level visual tasks. Existing works often fail to exploit the strong temporal correlation between frames and the details of local features, and therefore cannot distinguish dynamic relations (e.g., drinking) from static relations (e.g., holding). Furthermore, because of severe long-tailed bias, tail predicates are often classified inaccurately. To address these issues, a slowfast local-aware attention (SFLA) Network is proposed for temporal modeling in STSGG. First, a two-branch network extracts static and dynamic relation features separately. Second, a local relation-aware attention (LRA) module is proposed to give higher importance to the crucial elements of each local relationship. Third, three novel self-supervision prediction tasks are proposed: spatial location, human attention state, and distance variation. These self-supervision tasks are trained simultaneously with the main model to alleviate the long-tailed bias problem and enhance feature discrimination. Systematic experiments show that the method achieves state-of-the-art performance on the recently proposed Action Genome (AG) dataset and the popular ImageNet Video dataset.
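The abstract names a distance-variation self-supervision task and says the auxiliary tasks are trained together with the main model, but gives no formulas. The following is only a rough sketch of how such a setup might look, not the authors' implementation. Everything here is an assumption made for illustration: the function names, the three-way label scheme (closer / unchanged / farther), the pixel tolerance, the use of box centers, and the weighted-sum loss with `aux_weight`.

```python
import math


def box_center(box):
    """Return the (x, y) center of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def center_distance(subject_box, object_box):
    """Euclidean distance between the centers of two boxes."""
    sx, sy = box_center(subject_box)
    ox, oy = box_center(object_box)
    return math.hypot(sx - ox, sy - oy)


def distance_variation_label(prev_pair, curr_pair, tolerance=1.0):
    """Hypothetical pseudo-label derived from two frames, no annotation needed.

    prev_pair and curr_pair are (subject_box, object_box) tuples.
    Returns 0 = closer, 1 = unchanged (within tolerance), 2 = farther.
    """
    delta = center_distance(*curr_pair) - center_distance(*prev_pair)
    if delta < -tolerance:
        return 0
    if delta > tolerance:
        return 2
    return 1


def joint_loss(main_loss, aux_losses, aux_weight=0.1):
    """Assumed combination: main loss plus a weighted sum of auxiliary losses."""
    return main_loss + aux_weight * sum(aux_losses)


# A person box stays still while an object box moves toward it.
prev_pair = ((0, 0, 10, 10), (20, 0, 30, 10))
curr_pair = ((0, 0, 10, 10), (12, 0, 22, 10))
label = distance_variation_label(prev_pair, curr_pair)
total = joint_loss(1.0, [0.5, 0.5, 1.0], aux_weight=0.1)
```

Because the label is computed from detected boxes alone, a task of this kind adds supervision without extra annotation, which is the general appeal of self-supervised auxiliary objectives for rare (tail) predicates.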
Pages: 38947 - 38966
Page count: 19