A Novel Multiresolution Spatiotemporal Saliency Detection Model and Its Applications in Image and Video Compression

Cited by: 764

Authors:

Guo, Chenlei [1]

Zhang, Liming [1]

Affiliations:

[1] Fudan Univ, Dept Elect Engn, Shanghai 200433, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2010 / Vol. 19 / No. 01

Keywords:

Hierarchical selectivity (HS); multiresolution wavelet domain foveation (MWDF); phase spectrum of Fourier transform (PFT); phase spectrum of quaternion Fourier transform (PQFT); receiver operating characteristic (ROC) curve; spatiotemporal saliency map; visual attention; VISUAL-ATTENTION; FOURIER-TRANSFORMS; REGIONS; IMPLEMENTATION; HYPERCOMPLEX; QUATERNION; FOVEATION; SELECTION; DRIVEN; SHIFTS;

DOI:

10.1109/TIP.2009.2030969

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Salient areas in natural scenes are generally regarded as areas which the human eye will typically focus on, and finding these areas is the key step in object detection. In computer vision, many models have been proposed to simulate the behavior of eyes such as SaliencyToolBox (STB), Neuromorphic Vision Toolkit (NVT), and others, but they demand high computational cost and computing useful results mostly relies on their choice of parameters. Although some region-based approaches were proposed to reduce the computational complexity of feature maps, these approaches still were not able to work in real time. Recently, a simple and fast approach called spectral residual (SR) was proposed, which uses the SR of the amplitude spectrum to calculate the image's saliency map. However, in our previous work, we pointed out that it is the phase spectrum, not the amplitude spectrum, of an image's Fourier transform that is key to calculating the location of salient areas, and proposed the phase spectrum of Fourier transform (PFT) model. In this paper, we present a quaternion representation of an image which is composed of intensity, color, and motion features. Based on the principle of PFT, a novel multiresolution spatiotemporal saliency detection model called phase spectrum of quaternion Fourier transform (PQFT) is proposed in this paper to calculate the spatiotemporal saliency map of an image by its quaternion representation. Distinct from other models, the added motion dimension allows the phase spectrum to represent spatiotemporal saliency in order to perform attention selection not only for images but also for videos. In addition, the PQFT model can compute the saliency map of an image under various resolutions from coarse to fine. Therefore, the hierarchical selectivity (HS) framework based on the PQFT model is introduced here to construct the tree structure representation of an image. 
With the help of HS, a model called multiresolution wavelet domain foveation (MWDF) is proposed in this paper to improve coding efficiency in image and video compression. Extensive tests of videos, natural images, and psychological patterns show that the proposed PQFT model is more effective in saliency detection and can predict eye fixations better than other state-of-the-art models in previous literature. Moreover, our model requires low computational cost and, therefore, can work in real time. Additional experiments on image and video compression show that the HS-MWDF model can achieve higher compression rate than the traditional model.
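The phase-spectrum idea that the abstract attributes to the earlier PFT model can be illustrated with a minimal sketch for a single grayscale image: keep only the phase of the 2-D Fourier transform, invert it, square the magnitude, and smooth the result. This is an illustration under stated assumptions, not the authors' implementation. The function name `pft_saliency`, the `sigma` default, and the FFT-based circular Gaussian smoothing are choices made here for the example, and the quaternion (PQFT) extension with color and motion channels is not covered.

```python
import numpy as np


def pft_saliency(image, sigma=3.0):
    """Phase-spectrum saliency sketch for one grayscale image.

    `pft_saliency` and `sigma` are hypothetical names chosen for this
    example; they do not come from the paper.
    """
    img = np.asarray(image, dtype=float)

    # Forward 2-D Fourier transform of the image.
    spectrum = np.fft.fft2(img)

    # Discard the amplitude: keep a unit-magnitude spectrum with the same phase.
    phase_only = np.exp(1j * np.angle(spectrum))

    # Reconstruct from phase alone; squared magnitude is the raw saliency.
    saliency = np.abs(np.fft.ifft2(phase_only)) ** 2

    if sigma > 0:
        # Assumption: Gaussian smoothing applied in the frequency domain,
        # which implies wrap-around (circular) image boundaries.
        fy = np.fft.fftfreq(img.shape[0])[:, None]
        fx = np.fft.fftfreq(img.shape[1])[None, :]
        gauss = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
        saliency = np.real(np.fft.ifft2(np.fft.fft2(saliency) * gauss))

    return saliency


if __name__ == "__main__":
    # Toy input: one bright pixel on a dark background.
    demo = np.zeros((32, 32))
    demo[10, 20] = 1.0
    sal = pft_saliency(demo)
    print(np.unravel_index(np.argmax(sal), sal.shape))
```

On the toy input, the phase-only reconstruction concentrates its energy at the isolated pixel, so the maximum of the map falls at that location. Real images would typically be downsampled first, and the smoothing width would need tuning.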

Citation

Pages: 185-198

Page count: 14
