Video Summarization Based on Multimodal Features

Cited by: 0
Authors
Zhang, Yu [1]
Liu, Ju [2]
Liu, Xiaoxi [1]
Gao, Xuesong [3]
Affiliations
[1] Shandong Univ, Informat & Commun Engn, Qingdao, Peoples R China
[2] Shandong Univ, Dept Elect Engn, Qingdao, Peoples R China
[3] Hisense Grp, Qingdao, Peoples R China
Keywords
Feature Fusion; Information Science; LSTM; Multimedia Processing; Multimodal Features; Video Summarization
DOI
10.4018/IJMDEM.2020100104
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
In this manuscript, the authors present a keyshot-based supervised video summarization method that uses feature fusion and LSTM networks. The framework has three parts: 1) The authors formulate video summarization as a sequence-to-sequence problem in which the importance score of video content is predicted from the video feature sequence. 2) By considering visual and textual features together, the authors build deeply fused multimodal features and summarize videos with a recurrent encoder-decoder architecture based on bi-directional LSTM. 3) Most importantly, to train the supervised framework, the authors use the number of users who selected the current video clip for their final video summary as the importance score and ground truth. The method is compared with state-of-the-art methods and with different variants of FLSum and T-FLSum. The F-score and rank correlation coefficient results on TVSum and SumMe show the outstanding performance of the proposed method.
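The two quantities the abstract relies on, vote-based importance scores and the F-score between a predicted and a reference summary, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the binary per-clip annotation format, and the top-k selection rule are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import List, Sequence


def importance_from_votes(user_summaries: Sequence[Sequence[int]]) -> List[int]:
    """Per clip, count how many users kept that clip in their summary.

    Assumes each user summary is a 0/1 list with one entry per clip.
    """
    n_clips = len(user_summaries[0])
    return [sum(user[i] for user in user_summaries) for i in range(n_clips)]


def select_top_k(scores: Sequence[float], k: int) -> List[int]:
    """Hypothetical selection rule: keep the k highest-scoring clips.

    Ties are broken in favour of the earlier clip (sorted() is stable).
    """
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    chosen = set(ranked[:k])
    return [1 if i in chosen else 0 for i in range(len(scores))]


def f_score(predicted: Sequence[int], reference: Sequence[int]) -> float:
    """Harmonic mean of precision and recall over selected clips."""
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    if overlap == 0:
        return 0.0
    precision = overlap / sum(predicted)
    recall = overlap / sum(reference)
    return 2 * precision * recall / (precision + recall)


# Three hypothetical users annotating a four-clip video.
users = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
votes = importance_from_votes(users)      # ground-truth importance per clip
summary = select_top_k(votes, k=2)        # stand-in for a model's prediction
scores = [f_score(summary, u) for u in users]
```

In a trained system the scores passed to `select_top_k` would come from the model rather than from the votes themselves; the votes are used here only to keep the example self-contained.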
Pages: 60-76
Number of pages: 17