AVSE CHALLENGE: AUDIO-VISUAL SPEECH ENHANCEMENT CHALLENGE

Cited by: 4
Authors
Blanco, Andrea Lorena Aldana [1 ]
Valentini-Botinhao, Cassia [1 ]
Klejch, Ondrej [1 ]
Gogate, Mandar [2 ]
Dashtipour, Kia [2 ]
Hussain, Amir [2 ]
Bell, Peter [1 ]
Affiliations
[1] Univ Edinburgh, Edinburgh, Midlothian, Scotland
[2] Edinburgh Napier Univ, Edinburgh, Midlothian, Scotland
Source
2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT | 2022
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Audio-visual speech enhancement; subjective intelligibility; LRS3; dataset;
DOI
10.1109/SLT54892.2023.10023284
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Audio-visual speech enhancement is the task of improving the quality of a speech signal when video of the speaker is available. It opens up the opportunity to improve speech intelligibility in adverse listening scenarios that are currently too challenging for audio-only speech enhancement models. The Audio-Visual Speech Enhancement (AVSE) challenge aims to set the first benchmark in this area. We provide participants with datasets and scripts to test their audio-visual speech enhancement models under a common framework for both training and evaluation. The data is derived from real-world videos and comprises noisy mixes in which audio from the target speaker is mixed with either a competing speaker or a noise signal. The submitted systems are evaluated by conducting AV intelligibility tests involving human participants. We expect this challenge to be a platform for advancing the field of audio-visual speech enhancement and to provide further insight into the scope and limitations of current AV speech enhancement approaches.
Pages: 465-471
Number of pages: 7
Related papers
50 in total
  • [21] Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement
    Zheng, Rui-Chen
    Ai, Yang
    Ling, Zhen-Hua
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 1430 - 1444
  • [22] THE IMPACT OF REMOVING HEAD MOVEMENTS ON AUDIO-VISUAL SPEECH ENHANCEMENT
    Kang, Zhiqi
    Sadeghi, Mostafa
    Horaud, Radu
    Alameda-Pineda, Xavier
    Donley, Jacob
    Kumar, Anurag
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7302 - 7306
  • [23] Application for Real-time Audio-Visual Speech Enhancement
    Gogate, Mandar
    Dashtipour, Kia
    Hussain, Amir
    INTERSPEECH 2023, 2023, : 2026 - 2027
  • [24] Using Twin-HMM-Based Audio-Visual Speech Enhancement as a Front-End for Robust Audio-Visual Speech Recognition
    Abdelaziz, Ahmed Hussen
    Zeiler, Steffen
    Kolossa, Dorothea
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 867 - 871
  • [25] An audio-visual speech recognition with a new mandarin audio-visual database
    Liao, Wen-Yuan
    Pao, Tsang-Long
    Chen, Yu-Te
    Chang, Tsun-Wei
    INT CONF ON CYBERNETICS AND INFORMATION TECHNOLOGIES, SYSTEMS AND APPLICATIONS/INT CONF ON COMPUTING, COMMUNICATIONS AND CONTROL TECHNOLOGIES, VOL 1, 2007, : 19 - +
  • [26] Expressive audio-visual speech
    Bevacqua, E
    Pelachaud, C
    COMPUTER ANIMATION AND VIRTUAL WORLDS, 2004, 15 (3-4) : 297 - 304
  • [27] Effects of aging on audio-visual speech integration
    Huyse, Aurelie
    Leybaert, Jacqueline
    Berthommier, Frederic
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2014, 136 (04): : 1918 - 1931
  • [28] Cogeneration of Innovative Audio-visual Content: A New Challenge for Computing Art
    Liu, Mengting
    Zhou, Ying
    Wu, Yuwei
    Gao, Feng
    MACHINE INTELLIGENCE RESEARCH, 2024, 21 (01) : 4 - 28
  • [29] AUDIO-VISUAL WAKE WORD SPOTTING SYSTEM FOR MISP CHALLENGE 2021
    Xu, Yanguang
    Sun, Jianwei
    Han, Yang
    Zhao, Shuaijiang
    Mei, Chaoyang
    Guo, Tingwei
    Zhou, Shuran
    Xie, Chuandong
    Zou, Wei
    Li, Xiangang
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 9246 - 9250