Real time face detection for multimodal speech recognition

Authors
Murai, K [1]
Nakamura, S [1]
Affiliation
[1] Fuji Xerox, Informat Media Lab, Kanagawa 2590157, Japan
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we propose a real-time system to detect the speaker's frontal face for multimodal speech recognition. It is widely acknowledged that automatic speech recognizers, like humans, can improve recognition performance by adding a visual modality, i.e., the speaker's facial image, to the audio modality [1][2]. The visual modality also provides inaudible information, such as the speaker's facial orientation [3] and the location of the mouth. To acquire this information, we have to localize the speaker's face in real time. Our system combines skin color detection with spatial feature detection. The color-based detection is fast but depends on the skin and background colors, while the spatial feature detection requires more computation. We applied color-based pruning to reduce the search space for the spatial feature detection. By detecting the facial orientation, the proposed method functions as a "Face to Talk" switch in place of the "Push to Talk" switch. In our experiment, color-based pruning reduced the search space by 53-97%, and the subsequent spatial detector correctly detected 98.9% of the frontal faces.
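The two-stage idea in the abstract (cheap color-based pruning first, costlier spatial detection only on what survives) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and not taken from the paper: the RGB skin rule in `is_skin`, the window size, the stride, and the `min_skin` threshold are hypothetical choices, and the spatial detector itself is left out. The code only shows how a skin mask and a summed-area table can discard candidate windows before a second stage would run.

```python
def is_skin(r, g, b):
    """Rough RGB skin heuristic (illustrative assumption, not the paper's color model)."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)


def skin_mask(image):
    """Binary mask: 1 where a pixel passes the skin heuristic, else 0."""
    return [[1 if is_skin(*px) else 0 for px in row] for row in image]


def summed_area_table(mask):
    """Table with one extra row and column, so window sums cost four lookups."""
    h, w = len(mask), len(mask[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += mask[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def candidate_windows(image, win=4, stride=2, min_skin=0.5):
    """Return the windows kept for a later spatial detector and the pruned fraction.

    A window (x, y) is kept when at least `min_skin` of its pixels look like skin.
    """
    mask = skin_mask(image)
    sat = summed_area_table(mask)
    h, w = len(mask), len(mask[0])
    kept, total = [], 0
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            total += 1
            skin = (sat[y + win][x + win] - sat[y][x + win]
                    - sat[y + win][x] + sat[y][x])
            if skin >= min_skin * win * win:
                kept.append((x, y))
    pruned = 1 - len(kept) / total if total else 0.0
    return kept, pruned


if __name__ == "__main__":
    # Synthetic 12x12 image: a 4x4 skin-colored patch on a blue background.
    SKIN, BG = (200, 140, 120), (30, 60, 200)
    image = [[SKIN if 4 <= x < 8 and 4 <= y < 8 else BG for x in range(12)]
             for y in range(12)]
    kept, pruned = candidate_windows(image)
    print("windows kept:", kept)
    print("fraction of search space pruned:", pruned)
```

In a full pipeline, only the windows in `kept` would be passed to the spatial feature detector, which is where the saving in computation would come from. How much is pruned depends heavily on the background and on the thresholds chosen.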
Pages: A373-A376
Page count: 4
Related papers
50 in total
  • [1] Speech ReaLLM - Real-time Streaming Speech Recognition with Multimodal LLMs by Teaching the Flow of Time
    Seide, Frank
    Doulaty, Morrie
    Shi, Yangyang
    Gaur, Yashesh
    Jia, Junteng
    Wu, Chunyang
    INTERSPEECH 2024, 2024, : 1900 - 1904
  • [2] Face Detection and Posture Recognition in a Real Time Tracking System
    Chung, Hung-Yuan
    Hou, Chun-Cheng
    Liang, Shou-Jyun
    2017 IEEE INTERNATIONAL SYMPOSIUM ON SYSTEMS ENGINEERING (ISSE 2017), 2017, : 100 - 105
  • [3] Face recognition with real time eye lid movement detection
    Sukri, Syazwan Syafiqah
    Ruhaiyem, Nur Intan Raihana
    Mohamed, Ahmad Sufril Azlan
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2017, 10645 LNCS : 352 - 363
  • [4] RECOGNITION OF SPEECH IN REAL TIME
    FIEVET, F
    MAISSIS, A
    WALRAVE, P
    AUTOMATISME, 1970, 15 (01): : 3 - &
  • [5] Real Time Speech - Interactive Bomb Disposal Robot With Face and Object Recognition
    Rakshith
    Prakash, Prithvi
    Fernandes, Sandesh
    Rahul, Syed
    2017 INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, COMMUNICATION, COMPUTER, AND OPTIMIZATION TECHNIQUES (ICEECCOT), 2017, : 828 - 839
  • [6] Real-time Integrated Face Detection and Recognition on Embedded GPGPUs
    Yi, Saehanseul
    Yoon, Illo
    Oh, Chanyoung
    Yi, Youngmin
    2014 IEEE 12TH SYMPOSIUM ON EMBEDDED SYSTEMS FOR REAL-TIME MULTIMEDIA (ESTIMEDIA), 2014, : 98 - 107
  • [7] A Learning Framework for Target Detection and Human Face Recognition in Real Time
    Huang, Jiaxing
    Yuan, Zhengnan
    Zhou, Xuan
    INTERNATIONAL JOURNAL OF TECHNOLOGY AND HUMAN INTERACTION, 2019, 15 (03) : 63 - 76
  • [8] Face detection and recognition in real-time automated attendance system
    Nagi, Gawed M.
    Rahmat, Rahmita O.K.
    Khalid, Fatimah
    Abdullah, Muhamad T.
    International Review on Computers and Software, 2012, 7 (03) : 959 - 964
  • [9] Real Time Human Face Detection and Recognition Based on Haar Features
    Arfi, Asif Mohammed
    Bal, Debasish
    Hasan, Mohammad Anisul
    Islam, Naeemul
    Arafat, Yasir
    2020 IEEE REGION 10 SYMPOSIUM (TENSYMP) - TECHNOLOGY FOR IMPACTFUL SUSTAINABLE DEVELOPMENT, 2020, : 517 - 521
  • [10] Comparison of Face Detection and Recognition Algorithms in Real-Time Video
    Sarahi Sanchez-Moreno, Alejandra
    Manuel Perez-Meana, Hector
    Olivares-Mercado, Jesus
    Sanchez-Perez, Gabriel
    Toscano-Medina, Karina
    KNOWLEDGE INNOVATION THROUGH INTELLIGENT SOFTWARE METHODOLOGIES, TOOLS AND TECHNIQUES (SOMET_20), 2020, 327 : 209 - 220