Video-based spatio-temporal scene graph generation with efficient self-supervision tasks

Cited by: 0

Authors:

Lianggangxu Chen

Yiqing Cai

Changhong Lu

Changbo Wang

Gaoqi He

Affiliations:

[1] Chongqing Institute of East China Normal University, Chongqing Key Laboratory of Precision Optics

[2] East China Normal University, School of Computer Science and Technology

[3] East China Normal University, School of Mathematical Sciences

Source:

Multimedia Tools and Applications | 2023 / Vol. 82

Keywords:

Spatio-temporal scene graph generation; Self-supervision; Local relation-aware attention

DOI:

Not available

Abstract:

Spatio-temporal Scene Graph Generation (STSGG) aims to extract a sequence of graph-based semantic representations from video for high-level visual tasks. Existing works often fail to exploit the strong temporal correlation and the details of local features, and therefore cannot distinguish dynamic relations (e.g., drinking) from static relations (e.g., holding). Furthermore, because of severe long-tailed bias, tail predicates are often classified inaccurately. To address these issues, a slowfast local-aware attention (SFLA) network is proposed for temporal modeling in STSGG. First, a two-branch network extracts static and dynamic relation features separately. Second, a local relation-aware attention (LRA) module assigns higher importance to the crucial elements in the local relationship. Third, three novel self-supervised prediction tasks are proposed: spatial location, human attention state, and distance variation. These tasks are trained jointly with the main model to alleviate the long-tailed bias problem and enhance feature discrimination. Systematic experiments show that our method achieves state-of-the-art performance on the recently proposed Action Genome (AG) dataset and the popular ImageNet Video dataset.
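
The abstract names distance variation as one of the self-supervision tasks but does not say how its targets are built. The sketch below is only one plausible reading, not the paper's method: it derives a pseudo-label from the change in distance between the subject and object box centers across two frames. The box format `(x1, y1, x2, y2)`, the function names, the three label strings, and the pixel tolerance `tol` are all assumptions made for illustration.

```python
from math import hypot
from typing import Tuple

# Assumed box format: (x1, y1, x2, y2) in pixels. Hypothetical, not from the paper.
Box = Tuple[float, float, float, float]


def center(box: Box) -> Tuple[float, float]:
    """Return the center point of an axis-aligned box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def center_distance(a: Box, b: Box) -> float:
    """Euclidean distance between the centers of two boxes."""
    (ax, ay), (bx, by) = center(a), center(b)
    return hypot(ax - bx, ay - by)


def distance_variation_label(
    subj_prev: Box, obj_prev: Box, subj_next: Box, obj_next: Box, tol: float = 1.0
) -> str:
    """Label how the subject-object distance changes between two frames.

    The labels and the tolerance are illustrative assumptions; no manual
    annotation is needed because the label comes from the boxes themselves.
    """
    delta = center_distance(subj_next, obj_next) - center_distance(subj_prev, obj_prev)
    if delta > tol:
        return "farther"
    if delta < -tol:
        return "closer"
    return "unchanged"


# Usage: a cup moving toward a person between two frames.
person = (0.0, 0.0, 10.0, 10.0)
cup_prev = (20.0, 0.0, 30.0, 10.0)
cup_next = (12.0, 0.0, 22.0, 10.0)
label = distance_variation_label(person, cup_prev, person, cup_next)
```

Labels of this kind could serve as targets for an auxiliary classification head trained alongside the main predicate classifier, which is the general pattern the abstract describes for its self-supervision tasks.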


Pages: 38947-38966

Page count: 19
