Attention-Based Multi-Layered Encoder-Decoder Model for Summarizing Non-Interactive User-Based Videos

Cited by: 0
Authors
Tiwari, Vasudha [1]
Bhatnagar, Charul [1]
Affiliations
[1] GLA Univ, Dept CEA, Mathura, India
Keywords
Multi-layered encoder-decoder; video summarization; attention; BiLSTM; LSTM
DOI
Not available
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline Classification Codes
0808; 0809
Abstract
Video summarization extracts the relevant content from a video and presents it in a compact form. User-based video summarization tailors the summary to the requirements of the user. In this work, a non-interactive, perception-based video summarization technique is proposed that uses an attention mechanism to capture the user's interest and extract relevant keyshots from the video in temporal order. Video summarization is formulated as a sequence-to-sequence learning problem, and a supervised method is proposed to solve it. Adding layers to an existing network makes it deeper, enables a higher level of abstraction, and facilitates better feature extraction. The proposed model therefore uses a multi-layered, deep summarization encoder-decoder network (MLAVS) with an attention mechanism to select the final keyshots from the video. The contextual information of the video frames is encoded by a multi-layered Bidirectional Long Short-Term Memory (BiLSTM) network. For decoding, a multi-layered attention-based Long Short-Term Memory (LSTM) network with a multiplicative score function is employed. Experiments are performed on the benchmark TVSum dataset, and the results are compared with recent work. The results show considerable improvement and demonstrate the efficacy of this methodology against most of the other available state-of-the-art methods.
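The abstract names a multiplicative score function but does not spell out its form. As a minimal sketch, assuming the common Luong-style "general" form score_i = s^T W h_i followed by a softmax, one decoder step of attention over the encoder states might look like the following. The function name, argument names, and toy inputs are hypothetical and are not taken from the paper.

```python
import math


def multiplicative_attention(decoder_state, encoder_states, weight):
    """Return (attention_weights, context) for one decoder step.

    Assumed score (not confirmed by the abstract): score_i = s^T W h_i.
    """
    # q = s^T W, computed once so each score is a plain dot product q . h_i
    hidden_size = len(weight[0])
    query = [
        sum(decoder_state[k] * weight[k][j] for k in range(len(decoder_state)))
        for j in range(hidden_size)
    ]
    scores = [sum(q * h for q, h in zip(query, h_i)) for h_i in encoder_states]

    # Numerically stable softmax over the encoder positions
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of the encoder states
    context = [
        sum(w * h_i[j] for w, h_i in zip(weights, encoder_states))
        for j in range(hidden_size)
    ]
    return weights, context


# Toy inputs (made up for illustration): two encoder states, identity W
s_t = [1.0, 0.0]
H = [[1.0, 0.0], [0.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
alpha, c_t = multiplicative_attention(s_t, H, W)
print(alpha, c_t)
```

In a full model, the encoder states would come from the multi-layered BiLSTM and the decoder state from the multi-layered LSTM; here both are plain lists so the scoring step can be read in isolation.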
Pages: 1 / 13
Page count: 13