Efficient Video Captioning with Frame Similarity-Based Filtering

Authors
Rashno, Elyas [1]
Zulkernine, Farhana [1]
Affiliations
[1] Queens Univ, Sch Comp, Kingston, ON, Canada
Keywords
Video Caption Generation; Video frame similarity; Sequence to Sequence; Stacked LSTM
DOI
10.1007/978-3-031-39821-6_7
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Video captioning combines computer vision and Natural Language Processing (NLP) to perform the challenging task of scene understanding. Rapid advances in artificial intelligence have led to growing interest in video captioning, which involves generating natural language descriptions of the visual content of videos. In this paper, we present a novel approach to video caption generation. The proposed method first extracts frames from the video and reduces the number of frames based on their similarity. The remaining frames are then processed by a Convolutional Neural Network (CNN) to extract a feature vector, which is fed into a Long Short-Term Memory (LSTM) network to generate the captions. The results are compared with those of state-of-the-art models and demonstrate that the proposed approach outperforms existing methods on the MSVD, M-VAD, and MPII-MD datasets.
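The frame-reduction step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure or threshold the authors use, so the mean absolute pixel difference, the threshold value, and the names `mean_abs_diff` and `filter_similar_frames` below are all assumptions made for illustration, not the paper's method. Frames are represented as flat lists of grayscale intensities in [0, 1] to keep the example dependency-free.

```python
from typing import List, Sequence

# Hypothetical representation: a frame flattened to grayscale intensities in [0, 1].
Frame = Sequence[float]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference; 0.0 means the frames are identical."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    if not a:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def filter_similar_frames(frames: List[Frame], threshold: float = 0.1) -> List[int]:
    """Return the indices of frames kept after dropping near-duplicates.

    A frame is kept only if it differs from the most recently kept frame
    by more than `threshold`. Both the metric and the threshold are
    assumptions for this sketch, not values taken from the paper.
    """
    kept: List[int] = []
    for i, frame in enumerate(frames):
        if not kept or mean_abs_diff(frames[kept[-1]], frame) > threshold:
            kept.append(i)
    return kept


if __name__ == "__main__":
    video = [
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.05, 0.0, 0.0],  # nearly identical to frame 0, dropped
        [1.0, 1.0, 1.0, 1.0],   # large change, kept
        [1.0, 0.9, 1.0, 1.0],   # nearly identical to frame 2, dropped
        [0.0, 0.0, 1.0, 1.0],   # differs from frame 2, kept
    ]
    print(filter_similar_frames(video))  # prints [0, 2, 4]
```

In a full pipeline of the kind the abstract outlines, only the kept frames would be passed on to the CNN feature extractor and then to the LSTM decoder, which is where the reduction in computation would come from.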
Pages: 98-112
Number of pages: 15