Accurate video saliency prediction via hierarchical fusion and temporal recurrence

Cited by: 2
Authors
Zhang, Yunzuo [1]
Zhang, Tian [1]
Wu, Cunyu [1]
Zheng, Yuxin [1]
Affiliations
[1] Shijiazhuang Tiedao Univ, Sch Informat Sci & Technol, Shijiazhuang 050043, Hebei, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Video saliency prediction; Hierarchical spatiotemporal feature; Temporal recurrence; 3D convolutional network; Attention mechanism; CONVOLUTIONAL NETWORKS; NEURAL-NETWORK; MODEL; EYE;
DOI
10.1016/j.imavis.2023.104744
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the ability to extract spatiotemporal features, 3D convolutional networks have become the mainstream method for Video Saliency Prediction (VSP). However, these methods cannot make full use of hierarchical spatiotemporal features and also lack focus on past salient features, which hinders further improvements in accuracy. To address these issues, we propose a 3D convolutional Network based on Hierarchical Fusion and Temporal Recurrence (HFTR-Net) for VSP. Specifically, we propose a Bi-directional Temporal-Spatial Feature Pyramid (BiTSFP), which adds the flow of shallow location information based on the previous flow of deep semantic information. Then, different from simple addition and concatenation, we design a Hierarchical Adaptive Fusion (HAF) mechanism that can adaptively learn the fusion weights of adjacent features to integrate them appropriately. Moreover, to utilize previous salient information, a Recall 3D convGRU (R3D GRU) module is integrated into the 3D convolution-based method for the first time. It subtly combines the local feature extraction of the 3D backbone with the long-term relationship modeling of the temporal recurrence mechanism. Experimental results on three common datasets demonstrate that HFTR-Net outperforms existing state-of-the-art methods in accuracy. © 2023 Elsevier B.V. All rights reserved.
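The abstract contrasts adaptive fusion with plain addition and concatenation but does not give the HAF formulation. As a rough illustration of the general idea only, the sketch below fuses two adjacent feature vectors with softmax-normalized weights. The function name `adaptive_fuse`, the two-logit parameterization, and the use of flat Python lists in place of 3D feature maps are all assumptions made for this example and are not taken from the paper.

```python
import math


def adaptive_fuse(shallow, deep, logits):
    """Weighted fusion of two equally sized feature vectors.

    Hypothetical illustration, not the paper's HAF module: `logits` holds
    two scalars that a network would learn by gradient descent; here they
    are simply passed in by the caller.
    """
    if len(shallow) != len(deep):
        raise ValueError("features must have the same length")
    if len(logits) != 2:
        raise ValueError("expected exactly two fusion logits")

    # Softmax over the two logits gives non-negative weights that sum to 1.
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    w_shallow, w_deep = (e / total for e in exps)

    # Element-wise convex combination of the two feature vectors.
    fused = [w_shallow * s + w_deep * d for s, d in zip(shallow, deep)]
    return fused, (w_shallow, w_deep)


# Equal logits reduce to a plain average of the two inputs.
fused, weights = adaptive_fuse([1.0, 2.0], [3.0, 4.0], [0.0, 0.0])
```

With equal logits the result is the simple average, which is the fixed-weight baseline the abstract argues against; unequal logits shift the balance toward one level of the hierarchy.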
Pages: 12