50 items in total
- [21] Audio-Visual Clustering for 3D Speaker Localization MACHINE LEARNING FOR MULTIMODAL INTERACTION, PROCEEDINGS, 2008, 5237 : 86 - 97
- [22] Candidate Speech Extraction from Multi-speaker Single-Channel Audio Interviews SPEECH AND COMPUTER, SPECOM 2023, PT I, 2023, 14338 : 210 - 221
- [23] Multi-speaker DoA Estimation Using Audio and Visual Modality Neural Processing Letters, 2023, 55 : 8887 - 8901
- [24] Exploiting the Complementarity of Audio and Visual Data in Multi-Speaker Tracking 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017), 2017: 446 - 454
- [25] The 'Audio-Visual Face Cover Corpus': Investigations into audio-visual speech and speaker recognition when the speaker's face is occluded by facewear 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012: 2247 - 2250
- [27] MULTI-SCALE HYBRID FUSION NETWORK FOR MANDARIN AUDIO-VISUAL SPEECH RECOGNITION 2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023: 642 - 647
- [29] Multimodal Learning Using 3D Audio-Visual Data for Audio-Visual Speech Recognition 2017 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2017: 40 - 43
- [30] CLeLfPC: a Large Open Multi-Speaker Corpus of French Cued Speech LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022: 987 - 994