50 entries in total
- [31] Audio-visual modeling for bimodal speech recognition 2001 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-5: E-SYSTEMS AND E-MAN FOR CYBERNETICS IN CYBERSPACE, 2002, : 181 - 186
- [32] Bimodal fusion in audio-visual speech recognition 2002 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL I, PROCEEDINGS, 2002, : 964 - 967
- [33] Looking to Listen at the Cocktail Party: A Speaker-Independent Audio-Visual Model for Speech Separation ACM TRANSACTIONS ON GRAPHICS, 2018, 37 (04)
- [34] An Attention Based Speaker-Independent Audio-Visual Deep Learning Model for Speech Enhancement MULTIMEDIA MODELING (MMM 2020), PT II, 2020, 11962 : 722 - 728
- [35] Speaker-Targeted Audio-Visual Models for Speech Recognition in Cocktail-Party Environments 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2120 - 2124
- [36] A Closer Look at Audio-Visual Multi-Person Speech Recognition and Active Speaker Selection 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6863 - 6867
- [38] Audio-visual continuous speech recognition using MPEG-4 compliant visual features 2002 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL I, PROCEEDINGS, 2002, : 960 - 963
- [40] Audio-visual fuzzy fusion for robust speech recognition 2013 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2013