Spatiotemporal Modeling for Video Summarization Using Convolutional Recurrent Neural Network

Cited by: 30
Authors
Yuan, Yuan [1 ,2 ]
Li, Haopeng [1 ,2 ]
Wang, Qi [1 ,2 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Shaanxi, Peoples R China
[2] Northwestern Polytech Univ, Ctr OPT IMagery Anal & Learning, Xian 710072, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
CRNN; CRSum; Sobolev loss; spatiotemporal modeling; video summarization;
DOI
10.1109/ACCESS.2019.2916989
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
In this paper, a novel neural network named CRSum is proposed for the video summarization task. The proposed network integrates feature extraction, temporal modeling, and summary generation into an end-to-end architecture. Compared with previous work on this task, the proposed method has three distinctive characteristics: 1) it is the first to leverage a convolutional recurrent neural network to model the spatial and temporal structure of video simultaneously for summarization; 2) thorough and fine-grained video features are obtained in the proposed architecture by trainable three-dimensional convolutional neural networks and feature fusion; and 3) a new loss function named Sobolev loss is defined, which constrains the derivative of sequential data and exploits the potential temporal structure of video. A series of experiments is conducted to demonstrate the effectiveness of the proposed method. We further analyze the method from different aspects through additional experiments.
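The abstract describes the Sobolev loss only as a loss that constrains the derivative of sequential data; it does not give the formula. As a rough illustration of that idea, the sketch below assumes a generic Sobolev-norm-style objective: the mean squared error between predicted and target frame scores, plus the mean squared error between their discrete temporal derivatives (forward differences). The function names, the `max_order` and `weight` parameters, and the use of forward differences are all assumptions made for illustration, not the definition used in the paper.

```python
from typing import List, Sequence


def finite_difference(values: Sequence[float], order: int = 1) -> List[float]:
    """Return the order-th forward difference of a sequence (hypothetical helper)."""
    diffs = list(values)
    for _ in range(order):
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
    return diffs


def mean_squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length sequences (0.0 if both are empty)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        return 0.0
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def sobolev_style_loss(
    predicted: Sequence[float],
    target: Sequence[float],
    max_order: int = 1,
    weight: float = 1.0,
) -> float:
    """Assumed Sobolev-norm-style loss: value error plus weighted derivative errors.

    This is an illustrative stand-in, not the loss defined in the paper.
    """
    loss = mean_squared_error(predicted, target)
    for k in range(1, max_order + 1):
        loss += weight * mean_squared_error(
            finite_difference(predicted, k), finite_difference(target, k)
        )
    return loss


# Hypothetical per-frame importance scores for a four-frame clip.
target_scores = [0.0, 1.0, 0.0, 1.0]
flat_prediction = [0.5, 0.5, 0.5, 0.5]

# The flat prediction has a value error of 0.25 and a derivative error of 1.0.
print(sobolev_style_loss(flat_prediction, target_scores))  # 1.25
```

In this toy case a plain mean squared error would score the flat prediction at 0.25, while the derivative term adds a penalty of 1.0 because the prediction ignores how the target changes from frame to frame. That is the sense in which a derivative constraint can expose temporal structure.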
Pages: 64676 - 64685
Number of pages: 10
Related papers
50 in total
  • [1] Hierarchical Recurrent Neural Network for Video Summarization
    Zhao, Bin
    Li, Xuelong
    Lu, Xiaoqiang
    PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017, : 863 - 871
  • [2] Video Summarization using Convolutional Neural Network and Random Forest Classifier
    Nair, Madhu S.
    Mohan, Jesna
    PROCEEDINGS OF THE 2019 IEEE REGION 10 CONFERENCE (TENCON 2019): TECHNOLOGY, KNOWLEDGE, AND SOCIETY, 2019, : 476 - 480
  • [3] RECURRENT CONVOLUTIONAL NEURAL NETWORK FOR VIDEO CLASSIFICATION
    Xu, Zhenqi
    Hu, Jiani
    Deng, Weihong
    2016 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO (ICME), 2016,
  • [4] CATEGORY DRIVEN DEEP RECURRENT NEURAL NETWORK FOR VIDEO SUMMARIZATION
    Song, Xinhui
    Chen, Ke
    Lei, Jie
    Sun, Li
    Wang, Zhiyuan
    Xie, Lei
    Song, Mingli
    2016 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2016,
  • [5] Development of Spatiotemporal Recurrent Neural Network for Modeling of Spatiotemporal Processes
    Lu, Xinjiang
    Xu, Du
    Liu, Wenbo
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2021, 17 (01) : 189 - 198
  • [6] Foveated convolutional neural networks for video summarization
    Jiaxin Wu
    Sheng-hua Zhong
    Zheng Ma
    Stephen J. Heinen
    Jianmin Jiang
    Multimedia Tools and Applications, 2018, 77 : 29245 - 29267
  • [7] Foveated convolutional neural networks for video summarization
    Wu, Jiaxin
    Zhong, Sheng-hua
    Ma, Zheng
    Heinen, Stephen J.
    Jiang, Jianmin
    MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (22) : 29245 - 29267
  • [8] Human Activity Recognition Based On Video Summarization And Deep Convolutional Neural Network
    Kushwaha, Arati
    Khare, Manish
    Bommisetty, Reddy Mounika
    Khare, Ashish
    Computer Journal, 2024, 67 (08) : 2601 - 2609
  • [9] Human Activity Recognition Based On Video Summarization And Deep Convolutional Neural Network
    Kushwaha, Arati
    Khare, Manish
    Bommisetty, Reddy Mounika
    Khare, Ashish
    COMPUTER JOURNAL, 2024,
  • [10] Controlling Length in Abstractive Summarization Using a Convolutional Neural Network
    Liu, Yizhu
    Luo, Zhiyi
    Zhu, Kenny Q.
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 4110 - 4119