Efficient video coding based on audio-visual focus of attention

Cited by: 26

Authors:

Lee, Jong-Seok [1]

De Simone, Francesca [1]

Ebrahimi, Touradj [1]

Affiliations:

[1] Ecole Polytech Fed Lausanne, Inst Elect Engn, Multimedia Signal Proc Grp MMSPG, CH-1015 Lausanne, Switzerland

Source:

JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION | 2011 / Vol. 22 / Issue 08

Keywords:

Video coding; Audio-visual focus of attention; Quality of experience; Audio-visual source localization; H.264/AVC; Flexible macroblock ordering (FMO); Canonical correlation analysis; Subjective quality assessment; MULTIMODAL SPEAKER DETECTION; SPATIAL ATTENTION; TRACKING; LINKS; INTEGRATION; FOVEATION;

DOI:

10.1016/j.jvcir.2010.11.002

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

This paper proposes an efficient video coding method using audio-visual focus of attention, which is based on the observation that sound-emitting regions in an audio-visual sequence draw viewers' attention. First, an audio-visual source localization algorithm is presented, where the sound source is identified by using the correlation between the sound signal and the visual motion information. The localization result is then used to encode different regions in the scene with different quality in such a way that regions close to the source are encoded with higher quality than those far from the source. This is implemented in the framework of H.264/AVC by assigning different quantization parameters for different regions. Through experiments with both standard and high definition sequences, it is demonstrated that the proposed method can yield considerable coding gains over the constant quantization mode of H.264/AVC without noticeable degradation of perceived quality. (C) 2010 Elsevier Inc. All rights reserved.
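The two steps described in the abstract (locating the sound source from audio-motion correlation, then coarsening quantization with distance from it) can be illustrated with a minimal sketch. It is not the authors' implementation: the keywords mention canonical correlation analysis, whereas this sketch substitutes a plain per-macroblock Pearson correlation, and the abstract does not state how distance maps to a quantization parameter. The function names `localize_source` and `qp_map`, and the values of `base_qp`, `max_offset` and `step`, are hypothetical choices made for illustration only.

```python
import numpy as np


def localize_source(audio_energy, motion):
    """Return (row, col) of the macroblock whose motion best follows the audio.

    audio_energy: shape (T,), one audio energy value per video frame.
    motion: shape (T, rows, cols), motion magnitude per macroblock per frame.
    Assumption: Pearson correlation stands in for the paper's analysis.
    """
    a = np.asarray(audio_energy, dtype=float)
    m = np.asarray(motion, dtype=float)
    a = a - a.mean()
    m = m - m.mean(axis=0)
    # Covariance between the audio signal and each macroblock's motion signal.
    cov = np.tensordot(a, m, axes=(0, 0))
    denom = np.linalg.norm(a) * np.linalg.norm(m, axis=0)
    # Macroblocks with no motion variation get correlation 0.
    corr = np.divide(cov, denom, out=np.zeros_like(cov), where=denom > 0)
    return np.unravel_index(np.argmax(corr), corr.shape)


def qp_map(source, rows, cols, base_qp=26, max_offset=8, step=4):
    """Quantization parameter per macroblock, growing with distance from source.

    Assumption: QP rises by 1 for every `step` macroblocks of Euclidean
    distance, capped at `max_offset`; the paper's actual mapping may differ.
    """
    r, c = np.indices((rows, cols))
    dist = np.hypot(r - source[0], c - source[1])
    offset = np.minimum(dist // step, max_offset)
    # H.264/AVC quantization parameters lie in 0..51; higher means coarser.
    return np.clip(base_qp + offset, 0, 51).astype(int)
```

In this sketch the macroblock at the estimated source keeps `base_qp`, while distant macroblocks receive a larger quantization parameter and therefore fewer bits. An encoder would still need a mechanism to apply a per-region parameter; the keywords suggest that the paper uses flexible macroblock ordering (FMO) for this.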


Pages: 704-711

Number of pages: 8
