Accurate video saliency prediction via hierarchical fusion and temporal recurrence

Cited by: 2
Authors
Zhang, Yunzuo [1]
Zhang, Tian [1]
Wu, Cunyu [1]
Zheng, Yuxin [1]
Affiliations
[1] Shijiazhuang Tiedao Univ, Sch Informat Sci & Technol, Shijiazhuang 050043, Hebei, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Video saliency prediction; Hierarchical spatiotemporal feature; Temporal recurrence; 3D convolutional network; Attention mechanism; CONVOLUTIONAL NETWORKS; NEURAL-NETWORK; MODEL; EYE;
DOI
10.1016/j.imavis.2023.104744
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the ability to extract spatiotemporal features, 3D convolutional networks have become the mainstream method for Video Saliency Prediction (VSP). However, these methods cannot make full use of hierarchical spatiotemporal features and also lack focus on past salient features, which hinders further improvements in accuracy. To address these issues, we propose a 3D convolutional Network based on Hierarchical Fusion and Temporal Recurrence (HFTR-Net) for VSP. Specifically, we propose a Bi-directional Temporal-Spatial Feature Pyramid (BiTSFP), which adds the flow of shallow location information based on the previous flow of deep semantic information. Then, different from simple addition and concatenation, we design a Hierarchical Adaptive Fusion (HAF) mechanism that can adaptively learn the fusion weights of adjacent features to integrate them appropriately. Moreover, to utilize previous salient information, a Recall 3D convGRU (R3D GRU) module is integrated into the 3D convolution-based method for the first time. It subtly combines the local feature extraction of the 3D backbone with the long-term relationship modeling of the temporal recurrence mechanism. Experimental results on the three common datasets demonstrate that the HFTR-Net outperforms existing state-of-the-art methods in accuracy. © 2023 Elsevier B.V. All rights reserved.
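The two ideas named in the abstract, learned weighting of adjacent features and a gated recurrent update over time, can be illustrated with a toy sketch. This is not the HFTR-Net implementation: the abstract does not specify how HAF computes its weights or how the R3D GRU is built. The sketch assumes softmax-normalized scalar weights for the fusion and a standard element-wise GRU update in which scalar weights stand in for the convolutions a ConvGRU would use. The names `adaptive_fuse`, `gru_step`, `logits`, and the parameter keys are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(shallow, deep, logits):
    """Fuse two same-length feature vectors with normalized weights.

    `logits` stands in for two learnable scalars (an assumption; the
    abstract does not say how the fusion weights are computed).
    """
    if len(shallow) != len(deep):
        raise ValueError("features must have the same length")
    w_shallow, w_deep = softmax(logits)
    return [w_shallow * s + w_deep * d for s, d in zip(shallow, deep)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(h_prev, x, p):
    """One element-wise GRU update.

    A ConvGRU would use convolutions where this toy version uses the
    scalar weights in `p` (hypothetical parameter names).
    """
    h_next = []
    for h, v in zip(h_prev, x):
        z = sigmoid(p["wz"] * v + p["uz"] * h)             # update gate
        r = sigmoid(p["wr"] * v + p["ur"] * h)             # reset gate
        cand = math.tanh(p["wh"] * v + p["uh"] * (r * h))  # candidate state
        h_next.append((1.0 - z) * h + z * cand)            # blend old and new
    return h_next


if __name__ == "__main__":
    # Equal logits give equal weights, so this is a plain average.
    fused = adaptive_fuse([1.0, 2.0], [3.0, 4.0], logits=[0.0, 0.0])
    params = {"wz": 0.5, "uz": 0.5, "wr": 0.5, "ur": 0.5, "wh": 1.0, "uh": 1.0}
    state = [0.0, 0.0]
    for frame in ([0.2, -0.1], fused):
        state = gru_step(state, frame, params)  # carry state across frames
    print(fused, state)
```

In a trained model the fusion logits and gate weights would be learned, and the features would be 3D tensors rather than short lists. The sketch only shows the data flow: weighted mixing of two feature levels, followed by a state that is carried from one frame to the next.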
Pages: 12
Related papers
50 in total
  • [41] Video saliency detection based on low-level saliency fusion and saliency-aware geodesic
    Li, Weisheng
    Feng, Siqin
    Guan, Hua-Ping
    Zhan, Ziwei
    Gong, Cheng
    JOURNAL OF ELECTRONIC IMAGING, 2019, 28 (01)
  • [42] Perception-oriented video saliency detection via spatio-temporal attention analysis
    Zhong, Sheng-hua
    Liu, Yan
    Ng, To-Yee
    Liu, Yang
    NEUROCOMPUTING, 2016, 207 : 178 - 188
  • [44] Fusion hierarchy motion feature for video saliency detection
    Xiao, Fen
    Luo, Huiyu
    Zhang, Wenlei
    Li, Zhen
    Gao, Xieping
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (11) : 32301 - 32320
  • [45] A Gated Fusion Network for Dynamic Saliency Prediction
    Kocak, Aysun
    Erdem, Erkut
    Erdem, Aykut
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2022, 14 (03) : 995 - 1008
  • [46] Temporal Saliency Query Network for Efficient Video Recognition
    Xia, Boyang
    Wang, Zhihao
    Wu, Wenhao
    Wang, Haoran
    Han, Jungong
    COMPUTER VISION, ECCV 2022, PT XXXIV, 2022, 13694 : 741 - 759
  • [47] Temporal color video demosaicking via motion estimation and data fusion
    Wu, XL
    Zhang, L
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2006, 16 (02) : 231 - 240
  • [48] TSI: Temporal saliency integration for video action recognition
    SenseTime Research
    Unknown
    Unknown
    Unknown
    arXiv, 1600,
  • [50] Hierarchical Temporal Fusion of Multi-grained Attention Features for Video Question Answering
    Xiao, Shaoning
    Li, Yimeng
    Ye, Yunan
    Chen, Long
    Pu, Shiliang
    Zhao, Zhou
    Shao, Jian
    Xiao, Jun
    NEURAL PROCESSING LETTERS, 2020, 52 (02) : 993 - 1003