Real time face detection for multimodal speech recognition

Authors
Murai, K [1]
Nakamura, S [1]
Affiliation
[1] Fuji Xerox, Informat Media Lab, Kanagawa 2590157, Japan
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we propose a real-time system that detects the speaker's frontal face for multimodal speech recognition. It is widely acknowledged that automatic speech recognizers, like humans, can improve recognition performance by adding a visual modality, i.e., the speaker's facial image, to the audio modality [1][2]. The visual modality also provides inaudible information, such as the speaker's facial orientation [3] and the location of the mouth. To acquire this information, the speaker's face must be localized in real time. Our system combines skin color detection with spatial feature detection. Color-based detection is fast but depends on the skin and background colors, while spatial feature detection requires more computation. We therefore apply color-based pruning to reduce the search space for the spatial feature detection. By detecting the facial orientation, the proposed method functions as a "Face to Talk" switch in place of a "Push to Talk" switch. In our experiment, color-based pruning reduced the search space by 53-97%, and the subsequent spatial detector correctly detected 98.9% of the frontal faces.
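The two-stage idea in the abstract (a cheap color test that prunes the search space before a costlier spatial detector runs) can be illustrated with a minimal sketch. This record does not state the color model, thresholds, or window scheme the authors used, so everything below is an assumption made for illustration: the normalized r-g chromaticity test, its threshold values, and the names `is_skin` and `candidate_windows` are hypothetical and are not the authors' method.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each in 0..255
Image = List[List[Pixel]]      # row-major list of pixel rows


def is_skin(pixel: Pixel) -> bool:
    """Hypothetical skin test on normalized r-g chromaticity.

    The threshold values are illustrative assumptions, not taken from the paper.
    """
    r, g, b = pixel
    total = r + g + b
    if total == 0:
        return False  # pure black carries no chromaticity information
    rn, gn = r / total, g / total
    return 0.36 <= rn <= 0.55 and 0.26 <= gn <= 0.36


def candidate_windows(
    image: Image, win: int, stride: int, min_skin_fraction: float = 0.5
) -> Tuple[List[Tuple[int, int]], int]:
    """Slide a win x win window over the image and keep only skin-rich windows.

    Returns (kept, total): the (top, left) corners of windows whose fraction of
    skin-colored pixels is at least min_skin_fraction, and the number of windows
    examined. Only the kept windows would be passed to the spatial detector.
    """
    height, width = len(image), len(image[0])
    mask = [[is_skin(p) for p in row] for row in image]
    kept: List[Tuple[int, int]] = []
    total = 0
    for top in range(0, height - win + 1, stride):
        for left in range(0, width - win + 1, stride):
            total += 1
            skin = sum(
                mask[y][x]
                for y in range(top, top + win)
                for x in range(left, left + win)
            )
            if skin / (win * win) >= min_skin_fraction:
                kept.append((top, left))
    return kept, total
```

In this sketch, the fraction of the search space removed by pruning is `1 - len(kept) / total`, which is the kind of quantity the abstract reports as a 53-97% reduction. A spatial feature detector would then be run only on the windows in `kept`.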
Pages: A373-A376
Number of pages: 4