Spatiotemporal Modeling for Video Summarization Using Convolutional Recurrent Neural Network

Cited by: 30
Authors
Yuan, Yuan [1, 2]
Li, Haopeng [1, 2]
Wang, Qi [1, 2]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Shaanxi, Peoples R China
[2] Northwestern Polytech Univ, Ctr OPT IMagery Anal & Learning, Xian 710072, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
CRNN; CRSum; Sobolev loss; spatiotemporal modeling; video summarization;
DOI
10.1109/ACCESS.2019.2916989
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this paper, a novel neural network named CRSum is proposed for the video summarization task. The proposed network integrates feature extraction, temporal modeling, and summary generation into an end-to-end architecture. Compared with previous work on this task, the proposed method has three distinctive characteristics: 1) it is the first to leverage a convolutional recurrent neural network to model the spatial and temporal structure of video simultaneously for summarization; 2) thorough and detailed video features are obtained in the proposed architecture by trainable three-dimensional convolutional neural networks and feature fusion; and 3) a new loss function named Sobolev loss is defined, aiming to constrain the derivative of sequential data and exploit the potential temporal structure of video. A series of experiments is conducted to demonstrate the effectiveness of the proposed method. We further analyze our method from different aspects through well-designed experiments.
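The abstract describes the Sobolev loss only as a loss that constrains the derivative of sequential data; it does not give the formula. The following is a minimal sketch of one plausible reading, assuming the loss adds a mean-squared error on finite differences of the predicted and target score sequences to the usual mean-squared error on the scores themselves. The function names, the `order` and `weight` parameters, and the use of forward differences are all illustrative assumptions, not the authors' definition.

```python
def finite_difference(seq):
    """First-order forward difference of a sequence of per-frame scores."""
    return [b - a for a, b in zip(seq, seq[1:])]


def mean_squared_error(pred, target):
    """Plain MSE between two equal-length sequences (0.0 if both are empty)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if not pred:
        return 0.0
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def sobolev_style_loss(pred, target, order=1, weight=1.0):
    """Hypothetical Sobolev-style loss (an assumption, not the paper's formula).

    Sums the MSE on the raw scores with the weighted MSE on each
    finite-difference "derivative" up to the given order.
    """
    loss = mean_squared_error(pred, target)
    d_pred, d_target = list(pred), list(target)
    for _ in range(order):
        d_pred = finite_difference(d_pred)
        d_target = finite_difference(d_target)
        loss += weight * mean_squared_error(d_pred, d_target)
    return loss
```

Under this assumed form, a prediction that differs from the target by a constant offset is penalized only by the value term, while a prediction that oscillates around the target is additionally penalized by the derivative term.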
Pages: 64676-64685
Page count: 10