CAPTURING LONG-RANGE DEPENDENCIES IN VIDEO CAPTIONING

Cited by: 0
Authors
Lee, Jaeyoung [1, 3]
Lee, Yekang [1]
Seong, Sihyeon [1, 3]
Kim, Kyungsu [2]
Kim, Sungjin [2]
Kim, Junmo [1, 3]
Affiliations
[1] Korea Adv Inst Sci & Technol, Daejeon, South Korea
[2] Samsung Res, Seoul, South Korea
[3] Mofl Inc, Seoul, South Korea
Keywords
Video captioning; non-local block; long short-term memory; long-range dependency; video representation
DOI
10.1109/icip.2019.8803143
Chinese Library Classification number
TB8 [Photographic technology]
Discipline classification code
0804
Abstract
Most video captioning networks rely on recurrent models, including long short-term memory (LSTM). However, these recurrent models suffer from a long-range dependency problem and are therefore insufficient for video encoding. To overcome this limitation, several studies have investigated the relationships between objects or entities and have shown excellent performance in video classification and video captioning. In this study, we analyze a video captioning network with a non-local block in terms of temporal capacity. We introduce a video captioning method that captures long-range temporal dependencies with a non-local block. The proposed model uses local and non-local features independently. We evaluate our approach on the Microsoft Video Description Corpus (MSVD, YouTube2Text) dataset. The experimental results show that a non-local block applied along the temporal axis can solve the long-range dependency problem of the LSTM in video captioning datasets.
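The abstract does not spell out how the non-local block is formulated, so the following is only a minimal sketch of the general idea: every frame feature attends to every other frame along the temporal axis, and the result is added back through a residual connection. It assumes the widely used embedded-Gaussian (softmax) form of a non-local block. The function name `temporal_non_local` and the projection matrices `W_theta`, `W_phi`, `W_g`, and `W_out` are hypothetical names chosen for illustration, not the authors' implementation.

```python
import math


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_non_local(frames, W_theta, W_phi, W_g, W_out):
    """Apply a non-local operation across the time axis.

    frames: list of T per-frame feature vectors, each of length d.
    W_theta, W_phi, W_g: hypothetical projection matrices (d_k x d).
    W_out: hypothetical output projection (d x d_k).
    Returns T vectors of length d.
    """
    theta = [matvec(W_theta, x) for x in frames]  # queries
    phi = [matvec(W_phi, x) for x in frames]      # keys
    g = [matvec(W_g, x) for x in frames]          # values
    dim = len(g[0])
    out = []
    for i, x_i in enumerate(frames):
        # Similarity of frame i to every frame j, however far apart in time.
        scores = [sum(a * b for a, b in zip(theta[i], p)) for p in phi]
        weights = softmax(scores)
        # Weighted sum of value vectors over all time steps.
        y_i = [sum(w * g_j[k] for w, g_j in zip(weights, g)) for k in range(dim)]
        # Residual connection keeps the original (local) feature.
        z_i = [r + x for r, x in zip(matvec(W_out, y_i), x_i)]
        out.append(z_i)
    return out
```

Because of the residual connection, setting `W_out` to all zeros makes the block return its input unchanged. Each frame's attention weights span all T time steps at once, which is what lets such a block relate distant frames without passing information step by step through a recurrent state.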
Pages: 1880-1884
Page count: 5
Related papers
50 in total
  • [31] Long-range sequential dependencies precede complex syntactic production in language acquisition
    Sainburg, Tim
    Mai, Anna
    Gentner, Timothy Q.
    PROCEEDINGS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES, 2022, 289 (1970)
  • [32] Pancreatic cancer pathology image segmentation with channel and spatial long-range dependencies
    Chen, Zhao-Min
    Liao, Yifan
    Zhou, Xingjian
    Yu, Wenyao
    Zhang, Guodao
    Ge, Yisu
    Ke, Tan
    Shi, Keqing
    COMPUTERS IN BIOLOGY AND MEDICINE, 2024, 169
  • [33] BIM Product Style Classification and Retrieval Based on Long-Range Style Dependencies
    Cui, Jia
    Zang, Mengwei
    Liu, Zhen
    Qi, Meng
    Luo, Rong
    Gu, Zhenyu
    Lu, Hongju
    BUILDINGS, 2023, 13 (09)
  • [34] ABSENCE OF LONG-RANGE ORDER WITH LONG-RANGE POTENTIALS
    BAUS, M
    JOURNAL OF STATISTICAL PHYSICS, 1980, 22 (01) : 111 - 119
  • [35] EFFICIENT KEYWORD SPOTTING BY CAPTURING LONG-RANGE INTERACTIONS WITH TEMPORAL LAMBDA NETWORKS
    Tura, Biel
    Escuder, Santiago
    Diego, Ferran
    Segura, Carlos
    Luque, Jordi
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 146 - 153
  • [36] Capturing native long-range contiguity by in situ library construction and optical sequencing
    Schwartz, Jerrod J.
    Lee, Choli
    Hiatt, Joseph B.
    Adey, Andrew
    Shendure, Jay
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2012, 109 (46) : 18749 - 18754
  • [37] Capturing Long-Range Memory Structures with Tree-Geometry Process Tensors
    Dowling, Neil
    Modi, Kavan
    Munoz, Roberto N.
    Singh, Sukhbinder
    White, Gregory A. L.
    PHYSICAL REVIEW X, 2024, 14 (04)
  • [38] Optimizing Quality of Experience for Long-Range UAS Video Streaming
    Shirey, Russell
    Rao, Sanjay
    Sundaram, Shreyas
    2021 IEEE/ACM 29TH INTERNATIONAL SYMPOSIUM ON QUALITY OF SERVICE (IWQOS), 2021
  • [39] Shaken, and Stirred: Long-Range Dependencies Enable Robust Outlier Detection with PixelCNN++
    Umapathi, Barath Mohan
    Chauhan, Kushal
    Shenoy, Pradeep
    Sridharan, Devarajan
    PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 1440 - 1450
  • [40] BlockEcho: Retaining Long-Range Dependencies for Imputing Block-Wise Missing Data
    Han, Qiao
    Li, Mingqian
    Yang, Yao
    Zhai, Yiteng
    PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024, : 4098 - 4106