A Novel Multiresolution Spatiotemporal Saliency Detection Model and Its Applications in Image and Video Compression

Cited by: 764
Authors
Guo, Chenlei [1]
Zhang, Liming [1]
Affiliation
[1] Fudan Univ, Dept Elect Engn, Shanghai 200433, Peoples R China
Keywords
Hierarchical selectivity (HS); multiresolution wavelet domain foveation (MWDF); phase spectrum of Fourier transform (PFT); phase spectrum of quaternion Fourier transform (PQFT); receiver operating characteristic (ROC) curve; spatiotemporal saliency map; visual attention; VISUAL-ATTENTION; FOURIER-TRANSFORMS; REGIONS; IMPLEMENTATION; HYPERCOMPLEX; QUATERNION; FOVEATION; SELECTION; DRIVEN; SHIFTS;
DOI
10.1109/TIP.2009.2030969
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Salient areas in natural scenes are generally regarded as areas which the human eye will typically focus on, and finding these areas is the key step in object detection. In computer vision, many models have been proposed to simulate the behavior of eyes, such as SaliencyToolBox (STB), Neuromorphic Vision Toolkit (NVT), and others, but they demand high computational cost, and whether they produce useful results depends mostly on the choice of parameters. Although some region-based approaches were proposed to reduce the computational complexity of feature maps, these approaches still were not able to work in real time. Recently, a simple and fast approach called spectral residual (SR) was proposed, which uses the SR of the amplitude spectrum to calculate the image's saliency map. However, in our previous work, we pointed out that it is the phase spectrum, not the amplitude spectrum, of an image's Fourier transform that is key to calculating the location of salient areas, and proposed the phase spectrum of Fourier transform (PFT) model. In this paper, we present a quaternion representation of an image which is composed of intensity, color, and motion features. Based on the principle of PFT, a novel multiresolution spatiotemporal saliency detection model called phase spectrum of quaternion Fourier transform (PQFT) is proposed in this paper to calculate the spatiotemporal saliency map of an image by its quaternion representation. Distinct from other models, the added motion dimension allows the phase spectrum to represent spatiotemporal saliency in order to perform attention selection not only for images but also for videos. In addition, the PQFT model can compute the saliency map of an image under various resolutions from coarse to fine. Therefore, the hierarchical selectivity (HS) framework based on the PQFT model is introduced here to construct the tree structure representation of an image.
With the help of HS, a model called multiresolution wavelet domain foveation (MWDF) is proposed in this paper to improve coding efficiency in image and video compression. Extensive tests on videos, natural images, and psychological patterns show that the proposed PQFT model is more effective in saliency detection and can predict eye fixations better than other state-of-the-art models in previous literature. Moreover, our model requires low computational cost and, therefore, can work in real time. Additional experiments on image and video compression show that the HS-MWDF model can achieve a higher compression rate than the traditional model.
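The PFT idea mentioned in the abstract (the phase spectrum, not the amplitude spectrum, locates salient areas) can be illustrated with a minimal NumPy sketch. This is an illustration under stated assumptions, not the authors' implementation: the function name `pft_saliency`, the smoothing width `sigma`, the FFT-based (circular) Gaussian smoothing, and the min-max normalisation are all choices made here, and the sketch covers only a single-channel image rather than the quaternion (PQFT) representation.

```python
import numpy as np


def pft_saliency(image, sigma=3.0):
    """Phase-only Fourier saliency sketch for a 2-D grayscale array.

    `sigma` (Gaussian smoothing width in pixels) is an assumed value,
    not one taken from the paper.
    """
    img = np.asarray(image, dtype=float)

    # Forward transform, then discard the amplitude: every coefficient
    # is replaced by a unit-magnitude number carrying only its phase.
    spectrum = np.fft.fft2(img)
    phase_only = np.exp(1j * np.angle(spectrum))

    # Reconstruct from the phase alone; squared magnitude gives the raw map.
    raw = np.abs(np.fft.ifft2(phase_only)) ** 2

    # Smooth with a Gaussian applied in the frequency domain
    # (circular boundary handling; an assumption of this sketch).
    fy = np.fft.fftfreq(img.shape[0])[:, None]
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    transfer = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    smooth = np.real(np.fft.ifft2(np.fft.fft2(raw) * transfer))

    # Rescale to [0, 1] so maps from different images are comparable.
    lo, hi = smooth.min(), smooth.max()
    if hi == lo:
        return np.zeros_like(smooth)
    return (smooth - lo) / (hi - lo)


if __name__ == "__main__":
    # Toy input: a single bright pixel on a dark background.
    demo = np.zeros((64, 64))
    demo[20, 40] = 1.0
    sal = pft_saliency(demo)
    print(np.unravel_index(np.argmax(sal), sal.shape))
```

Because only two FFTs and one pointwise operation are needed before smoothing, a sketch like this has no tunable feature maps, which is consistent with the abstract's emphasis on low computational cost.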
Pages: 185-198
Page count: 14
Related papers
50 in total
  • [21] Video Saliency Detection Using the Propagation of Image Saliency between Frames
    Zhu, Shaotong
    Zhang, Yingtao
    Liang, Tian
    PROCEEDINGS OF 2017 IEEE 2ND INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC), 2017, : 459 - 467
  • [22] A Saliency Map Approach to Optimize VMAF for Video and Image Compression
    Zheng, Lin
    Han, Jingning
    Xu, Yaowu
    2024 DATA COMPRESSION CONFERENCE, DCC, 2024, : 293 - 301
  • [23] Superpixel-based video saliency detection via the fusion of spatiotemporal saliency and temporal coherency
    Li, Yandi
    Xu, Xiping
    Zhang, Ning
    Du, Enyu
    OPTICAL ENGINEERING, 2019, 58 (08)
  • [24] GRAPH-THEORETIC SPATIOTEMPORAL CONTEXT MODELING FOR VIDEO SALIENCY DETECTION
    Wei, Lina
    Wang, Fangfang
    Li, Xi
    Wu, Fei
    Xiao, Jun
    2017 24TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2017, : 4197 - 4201
  • [25] STI-Net: Spatiotemporal integration network for video saliency detection
    Zhou, Xiaofei
    Cao, Weipeng
    Gao, Hanxiao
    Ming, Zhong
    Zhang, Jiyong
    INFORMATION SCIENCES, 2023, 628 : 134 - 147
  • [26] Video Saliency Detection Using Multi-level Spatiotemporal Orientation
    Liu, Zhao
    Wang, Zhenyang
    Song, Xinhui
    Chen, Chun
    2015 10TH INTERNATIONAL CONFERENCE ON INFORMATION, COMMUNICATIONS AND SIGNAL PROCESSING (ICICS), 2015,
  • [27] Unsupervised Uncertainty Estimation Using Spatiotemporal Cues in Video Saliency Detection
    Alshawi, Tariq
    Long, Zhiling
    AlRegib, Ghassan
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (06) : 2818 - 2827
  • [28] A spatiotemporal weighted dissimilarity-based method for video saliency detection
    Duan, Lijuan
    Xi, Tao
    Cui, Song
    Qi, Honggang
    Bovik, Alan C.
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2015, 38 : 45 - 56
  • [29] Spatiotemporal Saliency Detection for Video Sequences Based on Random Walk With Restart
    Kim, Hansang
    Kim, Youngbae
    Sim, Jae-Young
    Kim, Chang-Su
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2015, 24 (08) : 2552 - 2564
  • [30] Improving Video Saliency Detection via Localized Estimation and Spatiotemporal Refinement
    Zhou, Xiaofei
    Liu, Zhi
    Gong, Chen
    Liu, Wei
    IEEE TRANSACTIONS ON MULTIMEDIA, 2018, 20 (11) : 2993 - 3007