Automatic Speechreading with Applications to Human-Computer Interfaces

Cited by: 0

Authors:

Xiaozheng Zhang

Charles C. Broun

Russell M. Mersereau

Mark A. Clements

Affiliations:

[1] Georgia Institute of Technology, Center for Signal and Image Processing

[2] Motorola Human Interface Lab

Source:

EURASIP Journal on Advances in Signal Processing, Vol. 2002

Keywords:

automatic speechreading; visual feature extraction; Markov random fields; hidden Markov models; polynomial classifier; speech recognition; speaker verification

DOI:

Not available

Abstract:

There has been growing interest in introducing speech as a new modality into the human-computer interface (HCI). Because speech is multimodal in nature, its visual component is considered to yield information that is not always present in the acoustic signal and to enable improved system performance over acoustic-only methods, especially in noisy environments. In this paper, we investigate the usefulness of visual speech information in HCI-related applications. We first introduce a new algorithm that automatically locates the mouth region using color and motion information and segments the lip region using both color and edge information based on Markov random fields. We then derive a relevant set of visual speech parameters and incorporate them into a recognition engine. We present various visual feature performance comparisons to explore their impact on recognition accuracy, including the lip inner contour and the visibility of the tongue and teeth. Using a common visual feature set, we demonstrate two applications that exploit speechreading in a joint audio-visual speech signal processing task: speech recognition and speaker verification. The experimental results based on two databases demonstrate that the visual information is highly effective for improving recognition performance over a variety of acoustic noise levels.
