Video Summarization Based on Multimodal Features

Cited by: 0

Authors:

Zhang, Yu [1]

Liu, Ju [2]

Liu, Xiaoxi [1]

Gao, Xuesong [3]

Affiliations:

[1] Shandong Univ, Informat & Commun Engn, Qingdao, Peoples R China

[2] Shandong Univ, Dept Elect Engn, Qingdao, Peoples R China

[3] Hisense Grp, Qingdao, Peoples R China

Source:

INTERNATIONAL JOURNAL OF MULTIMEDIA DATA ENGINEERING & MANAGEMENT | 2020, Vol. 11, No. 4

Keywords:

Feature Fusion; Information Science; LSTM; Multimedia Processing; Multimodal Features; Video Summarization;

DOI:

10.4018/IJMDEM.2020100104

Chinese Library Classification:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

This manuscript presents a keyshot-based supervised video summarization method that uses feature fusion and LSTM networks. The framework has three parts: 1) The authors formulate video summarization as a sequence-to-sequence problem, in which the importance score of video content is predicted from the video feature sequence. 2) By jointly considering visual and textual features, the authors construct deep-fusion multimodal features and summarize videos with a recurrent encoder-decoder architecture based on bi-directional LSTM. 3) Most importantly, to train the supervised framework, the authors use the number of users who selected the current video clip for their final video summary as the importance score and ground truth. The method is compared with state-of-the-art methods and with different variants of FLSum and T-FLSum. The F-score and rank correlation coefficient results on TVSum and SumMe show the strong performance of the proposed method.
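The ground-truth construction and the F-score evaluation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `importance_scores` and `f_score`, the binary clip-level encoding of user summaries, and the absence of any score normalization are all assumptions made for illustration, since the abstract does not specify these details.

```python
from typing import List


def importance_scores(user_selections: List[List[int]]) -> List[int]:
    """Count, for each clip, how many users put it in their summary.

    user_selections[u][c] is 1 if user u selected clip c, else 0.
    (Hypothetical encoding; the abstract only says the count of
    selecting users serves as the importance score.)
    """
    if not user_selections:
        return []
    n_clips = len(user_selections[0])
    if any(len(row) != n_clips for row in user_selections):
        raise ValueError("every user must annotate the same number of clips")
    return [sum(row[c] for row in user_selections) for c in range(n_clips)]


def f_score(predicted: List[int], reference: List[int]) -> float:
    """Harmonic mean of precision and recall between two binary summaries.

    Assumption: both summaries are compared clip by clip.
    """
    if len(predicted) != len(reference):
        raise ValueError("summaries must cover the same clips")
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    if overlap == 0:
        return 0.0
    precision = overlap / sum(predicted)
    recall = overlap / sum(reference)
    return 2 * precision * recall / (precision + recall)


# Three hypothetical users annotating a four-clip video.
selections = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
scores = importance_scores(selections)
agreement = f_score([1, 1, 0, 0], [1, 0, 1, 0])
```

In a supervised setup of the kind the abstract describes, scores like these would serve as regression targets for the sequence model, and the F-score would compare a generated summary against each user's summary.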


Pages: 60-76

Page count: 17

